NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acid sequences provided in the Sequence Listing, *
corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.






4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
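
The 5'-AGT-3' / 3'-TCA-5' example above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names reverse_complement and complementarity are hypothetical and are not part of the specification, and only the four unambiguous DNA bases are handled.

```python
# Illustrative sketch; function names are hypothetical, not from the specification.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which strand_a pairs with strand_b when
    both are given 5'->3' and aligned antiparallel (1.0 means complete)."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    partner = reversed(strand_b.upper())
    pairs = sum(COMPLEMENT[a] == b for a, b in zip(strand_a.upper(), partner))
    return pairs / len(strand_a)

# 3'-TCA-5' written 5'->3' is ACT.
print(reverse_complement("AGT"))      # ACT
print(complementarity("AGT", "ACT"))  # 1.0  (complete)
print(complementarity("AGT", "ACC"))  # 2 of 3 positions pair (partial)
```

A value of 1.0 corresponds to "complete" complementarity as defined above; any lower non-zero value corresponds to "partial" complementarity.
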
The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
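
The T-to-U substitution described above for RNA can be illustrated with a short Python sketch. The helper name to_rna is hypothetical and not part of the specification; the accepted alphabet (A, C, G, T, N) is taken from the symbol definitions in the preceding paragraph.

```python
# Illustrative sketch; to_rna is a hypothetical helper, not from the specification.
DNA_ALPHABET = set("ACGTN")  # symbols defined in the paragraph above

def to_rna(dna: str) -> str:
    """Return the RNA form of a listed sequence by substituting U for T."""
    dna = dna.upper()
    unknown = set(dna) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return dna.replace("T", "U")

print(to_rna("AGTN"))  # AGUN
```
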

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York, NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is approximately 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are approximately 300 times more twenty-mers than there are base pairs
in a set of human chromosomes. Using the same analysis, the probability for a seventeen-mer to
be fully matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five because
expressed sequences comprise less than approximately 5% of the entire genome sequence.
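
The arithmetic behind these estimates can be reproduced with a short Python sketch. It is a sketch under stated assumptions: bases are treated as uniformly random and independent, the genome size (three billion base pairs) and the 5% expressed fraction are the figures given above, and the name kmers_per_position is hypothetical. The text rounds the results to "1 in 300" and "one in five"; the computed ratios are of the same order.

```python
# Illustrative sketch; kmers_per_position is a hypothetical name.
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on the expressed fraction (from the text)

def kmers_per_position(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers (4**k) to positions in the target sequence.

    A ratio of N corresponds to a chance of roughly 1 in N that a given
    k-mer is fully matched somewhere in the target.
    """
    return 4 ** k / target_bp

print(round(kmers_per_position(20, GENOME_BP)))                          # 367
print(round(kmers_per_position(17, GENOME_BP), 1))                       # 5.7
print(round(kmers_per_position(15, GENOME_BP * EXPRESSED_FRACTION), 1))  # 7.2
```
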

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
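
The single-mismatch calculation described above (full-match probability multiplied by 3 x k) can be sketched as follows. The same assumptions apply as for the full-match estimate: uniformly random bases, a genome of three billion base pairs, and an expressed fraction of 5%. The function names are hypothetical. The computed values are of the same order as the "approximately one in five" figures stated in the text.

```python
# Illustrative sketch; function names are hypothetical.
GENOME_BP = 3_000_000_000   # from the text
EXPRESSED_FRACTION = 0.05   # from the text

def single_mismatch_probability(k: int) -> float:
    """Full-match probability (1 / 4**k) times the 3 * k single-base variants."""
    return (3 * k) / 4 ** k

def one_in(k: int, target_bp: float) -> float:
    """Return N such that a single-mismatch hit is expected about 1 time in N."""
    return 1 / (single_mismatch_probability(k) * target_bp)

print(round(one_in(20, GENOME_BP), 1))                       # 6.1
print(round(one_in(18, GENOME_BP * EXPRESSED_FRACTION), 1))  # 8.5
print(round(one_in(25, GENOME_BP)))                          # 5004
```
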

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
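
As an illustration of this definition, the following Python sketch scans one reading frame and reports each run of triplets that contains no termination codon. It assumes the standard genetic code stop codons (TAA, TAG, TGA) and, following the definition above, does not require a start codon; the function name and the min_codons parameter are hypothetical.

```python
# Illustrative sketch; open_reading_frames and min_codons are hypothetical names.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

def open_reading_frames(seq: str, frame: int = 0, min_codons: int = 1):
    """Yield (start, end) index pairs of stop-free runs of triplets in one frame."""
    seq = seq.upper()
    run_start = None
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            # A stop codon closes the current run, if any.
            if run_start is not None and (i - run_start) // 3 >= min_codons:
                yield run_start, i
            run_start = None
        elif run_start is None:
            run_start = i
    if run_start is not None:
        # Close a run that reaches the end of the sequence on a whole triplet.
        end = run_start + ((len(seq) - run_start) // 3) * 3
        if (end - run_start) // 3 >= min_codons:
            yield run_start, end

# Codons: ATG GCC TAA GGG TTT
print(list(open_reading_frames("ATGGCCTAAGGGTTT")))  # [(0, 6), (9, 15)]
```
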

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
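
The four residue groups listed above can be encoded directly to check whether a given substitution is conservative in the sense used here. This is a minimal sketch using standard one-letter amino acid codes; the names residue_class and is_conservative are hypothetical, and treating "same group" as the test for a conservative replacement is a simplifying assumption, since the text also allows other similarity criteria.

```python
# Illustrative sketch; names are hypothetical. Groups follow the paragraph above.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def residue_class(residue: str) -> str:
    """Return the name of the group that contains a one-letter residue code."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")

def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same group (simplifying assumption)."""
    return residue_class(original) == residue_class(replacement)

print(is_conservative("L", "I"))  # True  (both nonpolar)
print(is_conservative("K", "E"))  # False (basic vs. acidic)
```
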

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
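
The four oligonucleotide wash temperatures given above can be kept as a simple lookup. The Python sketch below is illustrative only: the names are hypothetical, and it deliberately does not interpolate between lengths, because the text gives temperatures for 14-, 17-, 20- and 23-base oligonucleotides only.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the exemplary wash temperature for a listed oligonucleotide length."""
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        # No value is invented for unlisted lengths.
        raise ValueError(
            f"no exemplary wash temperature listed for {oligo_length}-base oligonucleotides"
        ) from None
```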

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity and most preferably at least 98% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, more preferably at least about 95% identity, more
preferably at least 98% and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g., by varying
hybridization conditions.
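
The arithmetic in the definition of "substantially equivalent" above (changes divided by the length of the substantially equivalent sequence, with about 0.35 as the typical ceiling) can be sketched as follows. This Python fragment is a minimal illustration, not the Jotun Hein method: the function names are hypothetical, the counts of substitutions, additions and deletions are assumed to come from an alignment produced elsewhere, and the hard cut-off at exactly 0.35 is an assumption, since the text says "about". It captures only the numerical criterion, not the functional one.

```python
def fraction_changed(substitutions: int, additions: int, deletions: int,
                     subject_length: int) -> float:
    """Changes relative to the reference, divided by the subject sequence length."""
    if subject_length <= 0:
        raise ValueError("subject_length must be positive")
    return (substitutions + additions + deletions) / subject_length


def percent_identity(substitutions: int, additions: int, deletions: int,
                     subject_length: int) -> float:
    """Percent identity as used in the text: 35% changed corresponds to 65% identity."""
    if subject_length <= 0:
        raise ValueError("subject_length must be positive")
    changes = substitutions + additions + deletions
    return 100 * (subject_length - changes) / subject_length


def is_substantially_equivalent(substitutions: int, additions: int, deletions: int,
                                subject_length: int, max_fraction: float = 0.35) -> bool:
    """Numerical test only; the default threshold of 0.35 is taken from the text."""
    return fraction_changed(substitutions, additions, deletions,
                            subject_length) <= max_fraction
```

As a worked case, 20 substitutions, 10 additions and 5 deletions in a 100-residue sequence give 35/100 = 0.35, which corresponds to 65% sequence identity.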

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968,
2953-3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or
combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA
libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more
typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a polynucleotide recited
above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other
polynucleotide sequences in the same family of genes or can differentiate human genes from
genes of other species, and are preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly
contemplated.
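
The codon-substitution point above can be checked computationally: two ORFs that differ only by synonymous codons translate to the same amino acid sequence. The Python sketch below is illustrative only. It assumes the standard genetic code and in-frame sequences whose length is a multiple of three; the function names are hypothetical, and the short sequences in the usage note are made-up examples rather than sequences from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA (or RNA) ORF; '*' marks a stop codon."""
    seq = orf.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if both ORFs encode the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)
```

For example, `ATGGCTCTG` and `ATGGCCTTA` both encode Met-Ala-Leu, so `encode_same_protein` returns `True` for that pair even though the nucleotide sequences differ at three positions.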

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.
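
As a rough illustration of the oligonucleotide-directed mutagenesis described above, a mutagenic oligonucleotide can be laid out as the replacement codon flanked on both sides by unchanged template sequence. The Python sketch below is hypothetical throughout: the function name, the zero-based codon index, and the default flank of 12 nucleotides are assumptions made for illustration, and the text itself specifies only that the flanks be sufficient to form a stable duplex on either side of the changed site.

```python
def design_mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                           flank: int = 12) -> str:
    """Return the new codon flanked by up to `flank` template bases on each side.

    `template` is the coding-strand sequence in frame, and `codon_index` is the
    zero-based position of the codon to replace. The flank length is an assumed
    default, not a value taken from the text.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    if codon_index < 0 or start + 3 > len(template):
        raise ValueError("codon_index lies outside the template")
    upstream = template[max(0, start - flank):start]
    downstream = template[start + 3:start + 3 + flank]
    return upstream + new_codon.upper() + downstream
```

With a made-up template `ATGGCTCTGAAAGGT`, replacing the third codon (`CTG`) with `GAT` and using a 3-nucleotide flank gives `GCTGATAAA`.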

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 3937- 
3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant 
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein. 

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

WO 01/57190 PCT/US01/04098

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill
in the art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
PXTI, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.
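
Because an antisense nucleic acid is complementary to the sense (coding) strand, its sequence over a chosen window can be obtained as the reverse complement of that window. The Python sketch below is illustrative only: it assumes Watson-Crick pairing and a plain A/C/G/T coding-strand sequence, the function names and the zero-based `start` convention are hypothetical, and the short sequence in the usage note is a made-up example.

```python
# Watson-Crick complements for a DNA coding-strand sequence (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement, read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int) -> str:
    """Antisense sequence for `length` bases of the sense strand from `start` (zero-based)."""
    if length <= 0 or start < 0 or start + length > len(sense):
        raise ValueError("window lies outside the sense sequence")
    return reverse_complement(sense[start:start + length])
```

For the made-up sense sequence `AAATGGCC`, a 4-base window starting at position 2 covers `ATGG`, and the corresponding antisense sequence is `CCAT`.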

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID 
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be 
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense 
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more 

25 preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding 
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the 
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for 
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic 
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions 

using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or 
variously modified nucleotides designed to increase the biological stability of the molecules or to 
increase the physical stability of the duplex formed between the antisense and sense nucleic 
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
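
As an illustration of the Watson and Crick design rule described above, the following minimal Python sketch derives an antisense oligonucleotide as the reverse complement of a window of the coding strand, for example a window spanning the translation start site. The function names, the example sequence and the window coordinates are hypothetical and are not taken from the sequence listing; modified nucleotides and Hoogsteen pairing are not modeled.

```python
# Hypothetical helper names; the example sequence is invented for illustration
# and is not a sequence from the sequence listing.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo antisense to coding_strand[start:start + length]."""
    target = coding_strand[start:start + length]
    if start < 0 or len(target) < length:
        raise ValueError("window does not fit inside the coding strand")
    return reverse_complement(target)


if __name__ == "__main__":
    # Window of 9 nucleotides surrounding an ATG start codon.
    coding = "GCCACCATGGCTAGC"
    print(antisense_oligo(coding, start=3, length=9))
```

The window length would in practice be chosen from the range recited above (e.g., about 5 to about 50 nucleotides); the default of 20 is only an assumption for the sketch.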

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 

protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by

conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 

antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 


In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).


4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;

Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 

stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 

naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present 

invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant 

protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous

recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 

comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 985-1968,
2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., at
least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%,
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least about
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID 
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. 
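
The sequence identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two amino acid sequences have already been aligned to equal length by some alignment tool, with "-" marking gaps, and it uses one common convention (identical residues divided by aligned columns). The text above does not prescribe a particular identity calculation, so both the function name and the choice of denominator are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumes both strings come from the same alignment (equal length,
    '-' for gaps). Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Invented example peptides differing at one of eight positions.
    identity = percent_identity("MKTAYIAK", "MKTAYLAK")
    print(f"{identity:.1f}% identity; meets 85% threshold: {identity >= 85.0}")
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns, and they can give different percentages for the same pair.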

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 

be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 

nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
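
The notion of a degenerate variant can be made concrete with a short Python sketch that translates two different nucleotide sequences with the standard genetic code and checks that they encode the same polypeptide. The helper names and the two example ORFs are hypothetical; the sketch assumes the standard code and an ORF read in frame from its first nucleotide.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF into a one-letter amino acid string."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # Invented ORFs: GCT/GCC both encode Ala, AAA/AAG both encode Lys.
    print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```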

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 

sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 

cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 

or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 

expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include but are not 
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the

specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955- 
3960. 


The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a

selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program. 
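
The sequence-generation step of the alanine-scanning method described above can be sketched in Python as follows. The sketch only enumerates single-residue alanine substitution variants of a polypeptide; the activity testing is experimental and is not modeled. The function name, the variant labels (e.g., "K3A") and the example peptide are hypothetical.

```python
def alanine_scan(protein: str) -> dict:
    """Map variant labels such as 'K3A' to single-residue alanine variants.

    Positions are 1-based. Residues that are already alanine are skipped
    because substituting them would not change the sequence.
    """
    variants = {}
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        variants[label] = protein[:index] + "A" + protein[index + 1:]
    return variants


if __name__ == "__main__":
    # Invented five-residue peptide; each variant would then be assayed.
    for label, sequence in alanine_scan("MKCAW").items():
        print(label, sequence)
```

Substitution of strings of adjacent residues, also mentioned above, could be handled by replacing a window of residues rather than a single position.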

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 

Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 

culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 

35 heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving 
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hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol.
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
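
To make the notion of "largest match" concrete, the following Python sketch computes percent identity from a simple global (Needleman-Wunsch style) alignment. It is an illustration under stated assumptions and is not the GAP or BLAST implementation: the function names, the scoring values (match +1, mismatch -1, linear gap -1), and the choice to divide by alignment length are all assumptions, and the programs named above use substitution matrices and affine gap penalties that this sketch omits.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover the aligned strings.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length (assumed convention)."""
    top, bottom = global_align(a.upper(), b.upper())
    if not top:
        return 0.0
    matches = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * matches / len(top)
```

Different programs normalize identity differently (for example, by the shorter sequence or by aligned columns excluding gaps), so values from this sketch are not directly comparable with GAP or BLAST output.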

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprise one or more domains fused to
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the
ligand/protein interaction may be useful therapeutically both for the treatment of proliferative
and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
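
The in-frame requirement described above can be checked computationally before any cloning is attempted. The Python sketch below is a hypothetical design aid, not a procedure from this disclosure: the function name `fuse_in_frame`, the optional `linker` argument, and the example sequences are assumptions. It removes the stop codon of the upstream coding sequence, confirms that every part is a whole number of codons, and rejects fusions containing an internal stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(n_terminal_cds, c_terminal_cds, linker=""):
    """Join two DNA coding sequences so the second is read in the same frame.

    All three arguments are DNA strings read 5' to 3'. Names and behavior
    are illustrative assumptions, not an established tool.
    """
    first = n_terminal_cds.upper()
    second = c_terminal_cds.upper()
    linker = linker.upper()

    # The upstream partner must not terminate translation before the fusion point.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]

    for name, part in (("N-terminal CDS", first),
                       ("linker", linker),
                       ("C-terminal CDS", second)):
        if len(part) % 3 != 0:
            raise ValueError(
                f"{name} length {len(part)} is not a multiple of 3; "
                "the fusion would be out of frame"
            )

    fused = first + linker + second
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("fusion contains an internal stop codon")
    return fused
```

For example, `fuse_in_frame("ATGGCCTAA", "ATGAAATGA", linker="GGATCC")` joins the two toy sequences through a six-base linker, whereas a two-base linker raises `ValueError` because it would shift the reading frame of the downstream partner.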

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
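
An antisense molecule is, at the sequence level, the reverse complement of the region it targets. The following Python sketch derives a DNA antisense oligonucleotide for a chosen window of a target given as mRNA or cDNA. It is a minimal illustration under stated assumptions: the function name `antisense_oligo` and its `start` and `length` parameters are hypothetical, and real antisense design also weighs target accessibility, specificity and oligonucleotide chemistry, none of which are modeled here.

```python
# Complement table for producing a DNA oligo from an RNA or DNA target.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_oligo(target, start=0, length=None):
    """Return the DNA reverse complement of target[start:start + length].

    `target` is read 5' to 3'; the result is also written 5' to 3'.
    `start` is 0-based; `length=None` means "to the end of the target".
    """
    target = target.upper()
    region = target[start:] if length is None else target[start:start + length]
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(region))
    except KeyError as err:
        raise ValueError(
            f"unexpected base {err.args[0]!r} in target sequence"
        ) from None
```

A window spanning the translation start site is a common first choice, for example `antisense_oligo(mrna, start=0, length=20)` for a hypothetical `mrna` string that begins at the start codon.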

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.



4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptides as well as for studying modulators
of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).
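
As an illustration only, colony counts from such an assay might be summarized as a fold change over untreated control cultures. The sketch below is not taken from the cited references; the function names, the example counts and the twofold cut-off are hypothetical assumptions chosen for the example.

```python
from statistics import mean


def fold_expansion(treated_counts, control_counts):
    """Return mean colonies per treated plate divided by mean colonies per control plate."""
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control plates have no colonies; fold change is undefined")
    return mean(treated_counts) / control_mean


def shows_growth_factor_activity(treated_counts, control_counts, threshold=2.0):
    """Hypothetical decision rule: call activity when the fold change meets the threshold."""
    return fold_expansion(treated_counts, control_counts) >= threshold


# Made-up colony counts per plate, for illustration only.
control = [10, 12, 8]
treated = [30, 33, 27]
print(fold_expansion(treated, control))
print(shows_growth_factor_activity(treated, control))
```

In practice the cut-off and the number of replicate plates would be set by the experimental design, and a statistical test on the replicates would normally accompany the ratio.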

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).



4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.



4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146,
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867,
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).
Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. Proteins of the present invention (including, without limitation,
fragments of receptors and ligands) may themselves be useful as inhibitors of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
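
As a rough illustration of the competitive binding readout described above, the following sketch computes the percent diminution in complex formation caused by a test agent. The function name, the signal units and the example numbers are assumptions made for illustration only and do not come from the specification.

```python
def percent_diminution(signal_without_agent: float, signal_with_agent: float) -> float:
    """Percent decrease in complex formation caused by a test agent.

    Both arguments are hypothetical complex-formation readouts (for example,
    bound label counts) measured in the same units.
    """
    if signal_without_agent <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (signal_without_agent - signal_with_agent) / signal_without_agent


# Illustrative values only: 2000 counts without the agent, 500 counts with it.
print(percent_diminution(2000.0, 500.0))
```

A larger value indicates that the agent displaced more of the complex; the cutoff at which an agent counts as a "hit" would have to be chosen for the particular assay.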

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The response of the two cell populations to the addition of
ligand(s) is then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
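
The two-population comparison described above can be sketched as follows. This is only an illustrative outline: the function name, the response values, the ligand labels and the two-fold cutoff are all assumptions and are not taken from the specification.

```python
def candidate_ligands(with_receptor, without_receptor, min_fold=2.0):
    """Return ligand names whose response is at least `min_fold` times higher
    in receptor-expressing cells than in otherwise identical control cells.

    Both arguments map a (hypothetical) ligand name to a measured response.
    """
    hits = []
    for ligand, response in with_receptor.items():
        baseline = without_receptor.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        if response / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Illustrative responses in arbitrary units.
expressing = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.0}
control = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 3.0}
print(candidate_ligands(expressing, control))
```

Only ligands that provoke a clearly stronger response in the receptor-expressing population are reported; responses shared by both populations are treated as receptor-independent.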

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or may be utilized as an
antiproliferative agent, such as for acute or chronic myelogenous leukemia, or in the prevention
of premature labor secondary to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;




WO 01/57190 PCT/US01/04098 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
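
The restriction fragment length polymorphism analysis mentioned above can be illustrated with a minimal in silico sketch. The two short sequences are invented for illustration and are not sequences of the invention; the EcoRI recognition site (GAATTC, cut after the first G) is used only as a familiar example, and the function name is hypothetical.

```python
def fragment_lengths(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Cut `sequence` at every occurrence of `site` and return fragment lengths.

    `cut_offset` is the number of bases into the site at which the cut falls.
    Only the given strand is scanned, which is adequate for palindromic sites.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical alleles differing by one base, which destroys the site.
reference = "ATGCGAATTCTTAGGC"
variant = "ATGCGAGTTCTTAGGC"

print(fragment_lengths(reference, "GAATTC", 1))  # two fragments
print(fragment_lengths(variant, "GAATTC", 1))    # one uncut fragment
```

A difference in the fragment pattern between the two alleles is what the gel-based analysis would reveal; real genomic digests involve far longer sequences and many sites.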

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
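
The dosing and scoring schedule described above can be laid out as a short sketch. It assumes that the day of adjuvant injection is counted as day 0 and that "every other day" means days 0, 2, 4 and so on; the function names and the example scores are hypothetical and are not data from the specification.

```python
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # days on which an arthritis score is taken


def treatment_days(last_day: int = 24, interval: int = 2) -> list[int]:
    """Days on which the test compound is given, starting on the injection day."""
    return list(range(0, last_day + 1, interval))


def mean_score_reduction(control_scores, treated_scores) -> float:
    """Mean control score minus mean treated score (positive means less arthritis)."""
    control_mean = sum(control_scores) / len(control_scores)
    treated_mean = sum(treated_scores) / len(treated_scores)
    return control_mean - treated_mean


print(treatment_days())
# Made-up scores for one scoring day, PBS control versus treated animals.
print(mean_score_reduction([4, 6, 8], [2, 3, 4]))
```

In practice the comparison would be repeated for each scoring day and evaluated with an appropriate statistical test rather than a simple difference of means.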

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

20 One embodiment of the invention is the administration of an effective amount of the 

polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 

25 polypeptides or other composition of the invention will normally be determined by the 

prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 

30 administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 

35 The preparation of such solutions is within the skill of the art. 
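
As a worked illustration of the per-dose ranges stated above, the sketch below converts the per-kilogram ranges into an absolute window for a given body weight. It reads the ranges as about 0.01 µg/kg to 100 mg/kg (broad) and about 0.1 µg/kg to 10 mg/kg (preferred); the function names and the 70 kg body weight used in the usage note are assumptions chosen for illustration, not values from the text.

```python
# Ranges expressed in micrograms per kg of body weight (1 mg = 1000 ug).
BROAD_UG_PER_KG = (0.01, 100_000.0)    # about 0.01 ug/kg to 100 mg/kg
PREFERRED_UG_PER_KG = (0.1, 10_000.0)  # about 0.1 ug/kg to 10 mg/kg


def dose_window_ug(body_weight_kg, range_ug_per_kg):
    """Return (low, high) amount per dose, in micrograms, for one patient."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return low * body_weight_kg, high * body_weight_kg


def in_range(dose_ug, body_weight_kg, range_ug_per_kg):
    """Check whether an absolute dose falls inside a per-kg range."""
    low, high = dose_window_ug(body_weight_kg, range_ug_per_kg)
    return low <= dose_ug <= high
```

For a hypothetical 70 kg patient, `dose_window_ug(70.0, PREFERRED_UG_PER_KG)` gives a window of roughly 7 µg to 700 mg per dose. The actual dose remains a matter for the prescribing physician, as the text states.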

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
5 including without limitation from recombinant and non-recombinant sources and including 

antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 

10 fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 

M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 

factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 

25 composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti- 
inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 

30 hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
5 Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

15 In practicing the method of treatment or use of the present invention, a therapeutically 

effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
25 active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
30 intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 

similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

20 Pharmaceutical compositions for use in accordance with the present invention thus may 

be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 

25 dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 

lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 

30 the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 

soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
5 about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 

10 active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 

15 other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 

20 barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 

25 liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient, 
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 

30 preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 

of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 

an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
5 glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 

10 sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 

15 polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 

20 co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 

25 known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 

30 skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
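
The VPD and VPD:5W compositions given above reduce to simple arithmetic. The sketch below is illustrative only: it computes the mass of each solute for a batch of VPD and the nominal concentrations after the 1:1 dilution with 5% dextrose in water, assuming that volumes are additive on mixing. The batch size and the function names are assumptions introduced for the example.

```python
# VPD composition from the text, in % w/v (grams per 100 mL),
# made up to volume in absolute ethanol.
VPD_W_V_PERCENT = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_W_V_PERCENT = 5.0  # the 5% dextrose in water diluent


def vpd_masses_g(volume_ml):
    """Grams of each solute needed for `volume_ml` of undiluted VPD."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {name: pct * volume_ml / 100.0 for name, pct in VPD_W_V_PERCENT.items()}


def vpd_5w_percent():
    """Nominal % w/v of each solute in VPD:5W (VPD diluted 1:1 with D5W).

    Assumes the two volumes are additive, so every concentration is halved.
    """
    diluted = {name: pct / 2.0 for name, pct in VPD_W_V_PERCENT.items()}
    diluted["dextrose"] = DEXTROSE_W_V_PERCENT / 2.0
    return diluted
```

For a hypothetical 100 mL batch, `vpd_masses_g(100.0)` calls for 3 g benzyl alcohol, 8 g polysorbate 80 and 65 g polyethylene glycol 300, with ethanol added to volume.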

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 

or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
5 properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 

10 protein(s) or other active ingredient(s) of present invention along with protein or peptide 

antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 

20 pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 

25 lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

30 The amount of protein or other active ingredient of the present invention in the 

pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 

attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 

10 composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 

described above, may alternatively or additionally, be administered simultaneously or 

sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 

20 capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 

25 may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 

hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 

30 aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 

glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
5 (including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 

hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 

10 The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 

15 compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

20 The therapeutic compositions are also presently valuable for veterinary applications. 

Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 

25 modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 

30 other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
5 activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 



4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve its 

10 intended purpose. More specifically, a therapeutically effective amount means an amount 

effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 

appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the 
concentration of the test compound which achieves a half-maximal inhibition of the protein's 
biological activity). Such information can be used to more accurately determine useful doses in 
humans. 
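
To make the IC50 definition above concrete, the following sketch estimates the concentration giving half-maximal inhibition from a small dose-response series. It uses log-linear interpolation between the two measurements that bracket 50% inhibition, which is a simple stand-in chosen for illustration and is not a method prescribed by this document. The function name and the data in the usage note are hypothetical.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the IC50 by log-linear interpolation.

    concentrations: positive test-compound concentrations (any consistent unit)
    inhibitions: fractional inhibition (0.0 to 1.0) measured at each concentration
    """
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, inhibitions))
    # Walk adjacent pairs and find the pair that brackets 50% inhibition.
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("data do not bracket 50% inhibition")
```

With hypothetical readings of 25% inhibition at 1 unit and 75% inhibition at 100 units, the estimate falls at the geometric midpoint, about 10 units. A fitted dose-response curve would normally be preferred when enough data points are available.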

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
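
By way of illustration only, the therapeutic index defined above as the ratio between LD50 and ED50 can be computed as in the following minimal Python sketch. The function name `therapeutic_index` and the numerical values are hypothetical and do not represent data for any compound of the invention; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical example values in mg/kg, chosen only to show the arithmetic.
example_ti = therapeutic_index(ld50=500.0, ed50=25.0)
print(example_ti)  # 20.0
```

A larger ratio corresponds to a wider margin between the toxic and the therapeutic dose, which is why compounds exhibiting high therapeutic indices are preferred.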

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
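
The proportion of time that plasma levels remain above the MEC can be estimated from a series of measured plasma concentrations (for example, values obtained by HPLC assay). The following Python sketch is one possible way to do so and is not a prescribed method: it assumes linear interpolation between successive samples, and the function name `fraction_time_above_mec` and the sample values are hypothetical.

```python
def fraction_time_above_mec(times, concs, mec):
    """Fraction of the sampled period during which concentration >= MEC.

    times: strictly increasing sample times (any consistent unit)
    concs: plasma concentrations measured at those times
    mec:   minimal effective concentration, same unit as concs
    Assumption: concentration changes linearly between samples.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matched samples")
    time_above = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        c0, c1 = concs[i], concs[i + 1]
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            time_above += dt                 # whole segment above the MEC
        elif c0 < mec and c1 < mec:
            continue                         # whole segment below the MEC
        else:
            # The segment crosses the MEC once; locate the crossing point.
            crossing = (mec - c0) / (c1 - c0)
            time_above += dt * (1.0 - crossing) if c1 >= mec else dt * crossing
    return time_above / (times[-1] - times[0])


# Hypothetical samples: hours after dosing and concentration in ng/mL.
coverage = fraction_time_above_mec([0, 2, 4, 8], [0.0, 10.0, 10.0, 2.0], mec=5.0)
print(coverage)  # 0.6875
```

In this hypothetical example the plasma level is above the MEC for about 69% of the sampled period, which falls within the most preferred 50-90% range stated above.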

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein,
e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the
Kyte-Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
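
As a non-limiting illustration of the hydropathy analysis described above, the following Python sketch computes a sliding-window Kyte-Doolittle profile (without Fourier transformation) and reports the most hydrophilic window as a candidate surface region. The scale values are the published Kyte-Doolittle hydropathy indices; the window length of 7 residues, the function names, and the example peptide are assumptions made for this sketch and are not sequences of the invention.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, subsequence, mean score) of the lowest-scoring window."""
    sequence = sequence.upper()
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical peptide used only to exercise the functions.
start, segment, score = most_hydrophilic_window("IIIIKKKKKIIII", window=5)
print(start, segment, round(score, 2))  # 4 KKKKK -3.9
```

Low (negative) mean scores indicate hydrophilic stretches, which are the regions most likely to lie on the protein surface and are therefore candidate immunogenic peptides.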

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.



5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
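
For illustration, a basic Scatchard estimate of binding affinity can be sketched as follows. This is a plain unweighted linear Scatchard fit (bound/free plotted against bound, with slope -1/Kd and x-intercept Bmax) and is not asserted to be the procedure of Munson and Pollard; the function name `scatchard_fit` and the synthetic data are hypothetical, and a single class of non-interacting binding sites is assumed.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from equilibrium binding data.

    bound: concentrations of bound ligand
    free:  corresponding concentrations of free ligand (same unit)
    Fits bound/free = (Bmax - bound) / Kd by ordinary least squares.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matched data points")
    if any(f <= 0 for f in free):
        raise ValueError("free concentrations must be positive")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("Scatchard slope must be negative for a valid fit")
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope of the Scatchard line is -1/Kd
    bmax = -intercept / slope    # x-intercept of the line is Bmax
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0 (arbitrary units).
free_ligand = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_ligand = [10.0 * f / (2.0 + f) for f in free_ligand]
kd, bmax = scatchard_fit(bound_ligand, free_ligand)
print(round(kd, 6), round(bmax, 6))  # 2.0 10.0
```

A smaller Kd corresponds to a higher binding affinity, so antibodies yielding low Kd values in such an analysis would be the ones selected.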

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).



5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

WO 01/57190 PCT/US01/04098

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991, EMBO J., 10:3655-3659.
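
The figure of ten different antibody molecules stated above can be checked by simple enumeration, as in the following Python sketch. The chain labels H1, H2, L1 and L2 are hypothetical names for the two heavy and two light chains co-expressed in a quadroma; the sketch assumes that each antibody consists of two heavy-chain/light-chain arms and that the order of the two arms does not matter.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Random assortment: any heavy chain may pair with any light chain (4 possible arms).
arms = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of arms, and the two arms may be identical.
molecules = list(combinations_with_replacement(arms, 2))

# The desired bispecific molecule carries both cognate heavy/light pairings.
desired = {("H1", "L1"), ("H2", "L2")}
correct = [molecule for molecule in molecules if set(molecule) == desired]

print(len(arms), len(molecules), len(correct))  # 4 10 1
```

Only one of the ten possible molecules has the correct bispecific structure, which illustrates why additional purification steps are needed with this traditional approach.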

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are cotransfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
²¹²Bi, ¹³¹I, ¹³¹In, ⁹⁰Y, and ¹⁸⁶Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
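
As one illustration of recording sequence information as an ASCII file, the following sketch writes sequences to a FASTA-style text file and reads them back. The FASTA layout, the function names and the identifier used in the usage note are assumptions chosen for illustration; the specification does not prescribe any particular file format.

```python
def write_fasta(path, records, width=60):
    """Write (identifier, sequence) pairs as a plain ASCII FASTA-style file."""
    with open(path, "w", encoding="ascii") as handle:
        for identifier, sequence in records:
            handle.write(f">{identifier}\n")
            # Wrap the sequence so each line holds at most `width` residues.
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read a FASTA-style file back into a dict of identifier -> sequence."""
    records = {}
    current = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = line[1:]
                records[current] = ""
            elif current is not None:
                records[current] += line
    return records
```

For example, `write_fasta("seqs.fasta", [("example_1", "ATGC" * 40)])` followed by `read_fasta("seqs.fasta")` returns the same identifier and sequence. The same records could equally be stored as rows in a database table.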

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof, or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
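
To make the notion of an ORF concrete, the following is a minimal sketch that scans the three forward reading frames of a nucleotide sequence for an ATG start codon followed in frame by a stop codon. It is a simple codon scan, not BLAST or BLAZE; the function name and the `min_codons` threshold are assumptions for illustration, and the reverse strand is not examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=30):
    """Return (start, end, frame) tuples for ATG...stop ORFs, forward frames only.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon (ATG included).
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next start codon in this frame
    return orfs
```

A candidate ORF found this way would still need to be compared against known sequences, for example with a homology search, before it is treated as protein encoding.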

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
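
The relationship between target length and random occurrence can be sketched numerically. The function below is an illustration only: it assumes independent, equally frequent residues (4 nucleotides or 20 amino acids), which real databases do not satisfy, and its name and parameters are hypothetical.

```python
def expected_random_matches(target_length, alphabet_size, database_positions):
    """Expected number of exact chance matches of one target in a random database.

    Assumes every residue is independent and equally likely, so the chance of a
    match at any one position is alphabet_size ** -target_length.
    """
    per_position = alphabet_size ** -target_length
    return database_positions * per_position
```

Under these assumptions a six-nucleotide target is expected about once in every 4^6 = 4096 positions, whereas a 30-nucleotide target is expected roughly once in every 4^30 (about 10^18) positions, which is why longer targets are far less likely to match by chance.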

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
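
As an illustration of searching for one nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to mean a stem sequence followed, after a short loop, by its exact reverse complement. The stem and loop lengths and the function names are assumptions for illustration; practical hairpin prediction would also allow mismatches and consider folding energy.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(sequence, stem=6, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) for perfect inverted repeats."""
    sequence = sequence.upper()
    hits = []
    for i in range(len(sequence) - 2 * stem - min_loop + 1):
        left = sequence[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = sequence[j:j + stem]
            if len(right) < stem:
                break  # the second stem arm would run past the sequence end
            if right == reverse_complement(left):
                hits.append((i, loop))
    return hits
```

For the sequence `TTGCGCATAAAAATGCGCTT`, the stem `GCGCAT` at position 2 pairs with `ATGCGC` across a four-base loop, so the function reports one candidate.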

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
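
The use of sequence information in antisense design can be sketched as follows: given an mRNA sequence, the function returns a DNA oligonucleotide complementary to a chosen region, written 5' to 3'. The function name, the choice of a DNA oligonucleotide and the example sequence are assumptions for illustration; selecting an effective target region involves considerations not modeled here.

```python
# Complement of each mRNA base as a DNA base; T is accepted for cDNA-style input.
_TO_DNA_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo complementary to mrna[start:start + length], 5'->3'."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_TO_DNA_COMPLEMENT)[::-1]
```

For the hypothetical mRNA `AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG`, `antisense_oligo(mrna, 0, 20)` gives `CGGCCCATTACAATGGCCAT`, which is complementary to the first 20 bases.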

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of a cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
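
A degenerate pool can be enumerated from a probe written with the standard IUPAC nucleotide ambiguity codes. The sketch below is illustrative only; the function name is hypothetical, and the specification does not state how degenerate positions are to be written.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[symbol] for symbol in probe.upper())
    return ["".join(bases) for bases in product(*choices)]
```

For instance, `expand_degenerate("AYG")` yields `ACG` and `ATG`. The pool size is the product of the number of alternatives at each position, so it grows quickly as degenerate positions are added.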

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
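
The quantities in the linkage method can be worked through with the dilution relation C1 x V1 = C2 x V2. The sketch below is an illustration under stated assumptions: it takes 7.5 ng/ul to be the DNA concentration before the 1-MeIm7 stock is added (the text does not say whether the figure is before or after), and the function and key names are hypothetical.

```python
def covalink_well_composition(
    dna_stock_ng_per_ul=7.5,   # DNA in water, assumed to be before adding 1-MeIm7
    meim_stock_molar=0.1,      # 0.1 M 1-methylimidazole stock
    meim_final_molar=0.010,    # 10 mM final 1-MeIm7
    dna_volume_ul=75.0,        # ss DNA solution dispensed per well
    edc_stock_molar=0.2,       # 0.2 M EDC in 10 mM 1-MeIm7
    edc_volume_ul=25.0,        # EDC solution added per well
):
    """Estimate per-well amounts for the CovaLink NH coupling step."""
    # Fraction of the DNA/1-MeIm7 mixture made up of 1-MeIm7 stock (C1*V1 = C2*V2).
    meim_fraction = meim_final_molar / meim_stock_molar
    dna_ng_per_ul = dna_stock_ng_per_ul * (1.0 - meim_fraction)
    total_volume_ul = dna_volume_ul + edc_volume_ul
    return {
        "dna_ng_per_ul": dna_ng_per_ul,
        "dna_ng_per_well": dna_ng_per_ul * dna_volume_ul,
        "total_volume_ul": total_volume_ul,
        "edc_final_molar": edc_stock_molar * edc_volume_ul / total_volume_ul,
    }
```

With the values given in the text and this assumption, each well holds about 506 ng of DNA in a 100 ul reaction at roughly 50 mM EDC; the 1-MeIm7 concentration stays at 10 mM because both solutions contain it at that concentration.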

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS
The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever 
device allows controlled application of low to intermediate pressures to the cell. The results of these 
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA 
fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
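
The site specificities described above can be illustrated with a short in silico digest. The following is a minimal sketch only, assuming a linear single-stranded representation of the template (pUC19 itself is circular) and complete digestion at every matching site; the function names `cut_positions` and `fragments` are hypothetical and are not taken from Fitzgerald et al.

```python
import re

# Pu = A or G, Py = C or T. Lookahead patterns let overlapping sites be found.
NORMAL_SITES = re.compile(r"(?=[AG]GC[CT])")                        # PuGCPy
RELAXED_SITES = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # adds PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Return 0-based indices at which the strand is cut (between G and C)."""
    pattern = RELAXED_SITES if relaxed else NORMAL_SITES
    # A site starts at m.start(); the G is at +1 and the C at +2,
    # so the blunt cut falls immediately before index m.start() + 2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut position (complete digest assumed)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    demo = "AAGCTTTTGCCAAAGCA"
    print(cut_positions(demo))                # normal specificity
    print(cut_positions(demo, relaxed=True))  # relaxed (CviJI**) specificity
    print(fragments(demo, relaxed=True))
```

Because the relaxed patterns match roughly three times as many tetranucleotides as PuGCPy alone, the sketch yields shorter and more evenly distributed fragments, which is the property the shotgun approach relies on.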

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8x12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
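
The 64-patient example can be checked with a small layout calculation. This is a sketch under stated assumptions, not a description of any particular spotting device: it assumes a 9 mm pin pitch (the usual spacing of a 96-well plate, which the text does not state), an 8 x 8 offset grid within each subarray, and one dot per 1 mm x 1 mm cell. The names `dot_position` and `layout` are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device
PIN_PITCH_MM = 9.0           # assumption: standard 96-well plate spacing
GRID = 8                     # assumption: 8 x 8 = 64 dots per subarray
DOT_PITCH_MM = 1.0           # one dot per 1 mm x 1 mm cell


def dot_position(plate_index, pin_row, pin_col):
    """Return (x, y) in mm of the dot printed by one pin from one patient plate."""
    if not 0 <= plate_index < GRID * GRID:
        raise ValueError("plate_index must be in 0..63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin coordinates outside the 8 x 12 device")
    # Each plate is printed with the whole pin array shifted by a small offset,
    # so every pin builds up its own 8 x 8 subarray.
    off_row, off_col = divmod(plate_index, GRID)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)


def layout():
    """Map (plate, pin_row, pin_col) to membrane coordinates for all dots."""
    return {
        (p, r, c): dot_position(p, r, c)
        for p in range(GRID * GRID)
        for r in range(PIN_ROWS)
        for c in range(PIN_COLS)
    }


if __name__ == "__main__":
    dots = layout()
    print(len(dots), "dots")
    print("far corner:", max(x for x, _ in dots.values()), max(y for _, y in dots.values()))
```

Under these assumptions each subarray spans 8 mm, which leaves the 1 mm gap to the next subarray, and the full pattern fits within the 8x12 cm membrane.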

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2 

Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951,
and 3949-3954, were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
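
The seed-extension loop described above can be sketched as follows. This is an illustrative outline only, not the actual Hyseq implementation: the function `extend_assemblage` is hypothetical, the aligner is passed in as a callable standing in for BLASTN, and `toy_align` (longest common substring, reported as a score with 100% identity) exists purely so the sketch can be run without BLAST. Only the thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from difflib import SequenceMatcher


def passes(hit, min_score, min_identity):
    """Apply the inclusion criteria to an (alignment score, percent identity) pair."""
    score, identity = hit
    return score > min_score and identity > min_identity


def extend_assemblage(seed, databases, align, min_score=300.0, min_identity=95.0):
    """Grow an assemblage from a seed until no database sequence extends it."""
    members = [seed]
    remaining = [s for db in databases for s in db if s != seed]
    extended = True
    while extended:                      # repeat until a full pass adds nothing
        extended = False
        for candidate in list(remaining):
            # A candidate joins if it hits any sequence already in the assemblage.
            if any(passes(align(m, candidate), min_score, min_identity) for m in members):
                members.append(candidate)
                remaining.remove(candidate)
                extended = True
    return members


def toy_align(a, b):
    """Stand-in for BLASTN (assumption): longest common substring as the score."""
    match = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(
        0, len(a), 0, len(b)
    )
    return float(match.size), 100.0


if __name__ == "__main__":
    seed = "AAAAAACCCCCC"
    db1 = ["GGGGGGTTTTTT"]                 # overlaps the second sequence, not the seed
    db2 = ["CCCCCCGGGGGG", "ATATATATAT"]   # first overlaps the seed, second is unrelated
    print(extend_assemblage(seed, [db1, db2], toy_align, min_score=5.0))
```

The example shows why more than one pass is needed: a sequence that does not hit the seed directly may still join once an intermediate sequence has been added.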

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO:2953-3936, and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
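
The general idea behind Method C can be sketched in a few lines. The proprietary program itself is not disclosed, so the following is an assumption-laden illustration: it uses the standard genetic code, treats an open reading frame as the longest stretch of codons uninterrupted by a stop codon (no start codon is required), and resolves ties in favour of the first frame examined. The function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(dna):
    """Return the six translations keyed as +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, s in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def longest_orf(dna):
    """Return (frame, peptide) for the longest stop-free stretch in any frame."""
    best = ("", "")
    for name, protein in six_frames(dna).items():
        for segment in protein.split("*"):
            if len(segment) > len(best[1]):
                best = (name, segment)
    return best


if __name__ == "__main__":
    example = "ATGGCTGCTGCTTAA"
    print(six_frames(example))
    print(longest_orf(example))
```

In the example the longest stop-free stretch lies on the reverse strand rather than in the frame beginning with ATG, which illustrates why a production program may add further criteria such as requiring an initiating methionine.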

5.3 EXAMPLE 3
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO:1-351. The amino acids are SEQ ID NO:985-1335.

Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.






WO 01/57190 PCT/US01/04098 


Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.
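
For orientation, the two summary values reported in Table 7 can be derived from a per-residue score profile as sketched below. This is a simplified illustration and not the SignalP program: it assumes that a list of per-residue S scores and a predicted signal peptide length are already available, that the maximum is taken over the whole supplied profile, and that the mean is taken over the residues of the predicted signal peptide. The function name `s_score_summary` is hypothetical.

```python
def s_score_summary(s_scores, signal_peptide_length):
    """Return (maximum S score, mean S score) for one polypeptide.

    s_scores: per-residue S scores, N-terminal residue first (assumed input).
    signal_peptide_length: number of residues before the predicted cleavage site.
    """
    if not 0 < signal_peptide_length <= len(s_scores):
        raise ValueError("signal_peptide_length must lie within the score profile")
    max_s = max(s_scores)
    # Average only over the residues assigned to the signal peptide.
    mean_s = sum(s_scores[:signal_peptide_length]) / signal_peptide_length
    return max_s, mean_s


if __name__ == "__main__":
    profile = [1.0, 0.5, 0.0, 0.25]   # made-up values for illustration only
    print(s_score_summary(profile, 3))
```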

5.4 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acid sequences are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766.

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930.

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences for which the nucleic
acid sequence encodes are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.

Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.

Table 1 shows the various tissue sources of SEQ ID NO: 966-974.

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942. 

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 10 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
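
For illustration, the two summary values may be sketched as follows. The sketch assumes that a per-residue S score is already available for each position, that the maximum S score is the largest of these values, and that the mean S score is the average over the residues of the predicted signal peptide (from the N-terminus up to the predicted cleavage site). The function name `s_score_summary` and the example scores are hypothetical and are not output of the SignalP program.

```python
def s_score_summary(s_scores, signal_peptide_length):
    """Return (maximum S score, mean S score over the predicted signal peptide).

    s_scores: per-residue S scores, index 0 being the N-terminal residue.
    signal_peptide_length: number of residues before the predicted cleavage site.
    """
    if not 0 < signal_peptide_length <= len(s_scores):
        raise ValueError("signal peptide length must lie within the scored region")
    region = s_scores[:signal_peptide_length]
    max_s = max(s_scores)
    mean_s = sum(region) / len(region)
    return max_s, mean_s


if __name__ == "__main__":
    # Hypothetical scores for a five-residue window with a three-residue signal peptide.
    print(s_score_summary([0.9, 0.8, 0.7, 0.2, 0.1], 3))
```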

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.



TABLE 1 



Tissue Origin 


RNA 
Source 


Library 
Name 


SEQ ID NOS:


lung 






3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain 



GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739 
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135 
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208-
209 216-217 221 223 230 234-235 240
244 249 251 253 255 258-259 263 269-
270 277 282 285-286 290 294-295 297
301-302 304-305 307-308 311-312 314
320 329 333 335-336 342 344 346 349
354 358 365 370 373-374 377 380 382-
383 388 394-396 399 401-402 406 409-
410 413 416 420-421 425 428 430-431
436-437 442 456 462 464 466-467 474
484 486 495-496 500-501 506 508-509
519 530 537 542 549 561-562 564 572
574 577-578 580-583 586-587 589 592-
593 596-597 601 608 610 612-614 617-
[illegible]
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 

[illegible]

965 967 972 977 


adult brain 


Clontech 


ABR001 


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 

977 


adult brain 


Clontech 


ABR006 


19 32 49 53 60 72 91 103 118 125 130- 
131 134 184 224 275 338 350 354 361- 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811
818 887 903 906 918 930 942 947 957 
973 977 


adult brain 


Clontech 


ABR008 



2-3 9-11 14 17 21 23-25 28-29 31-35 37
41-42 45 47-48 56-57 65-66 69-70 72 75
77-78 88 91-92 97-99 101 103 112-115
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402- 
403 405 409-412 414 418-421 423-424 
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724 
728-730 732 734-735 738-740 745 747- 

[illegible]
772-773 775 780-781 789-791 793-795
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843 
845 856 858-859 861 864 866 870 872 
876 880 883 885 887 893-898 902 906- 
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 
962-965 967 969-970 972 977 


adult brain 


Clontech 


ABR011 


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880 


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain 


Invitrogen 


ABR016 


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 



Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348 
371 374 388 391 394 399 401 409 411
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured 
preadipocytes


Stratagene


ADP001 


4 28-29 69 93 114 121 132-133 135 151-
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 
312 314 326-327 361-362 371 374 388 
394 401 403 405 411 420 437 453 466- 
467 470 474 478 496 507-509 517 530 
532-533 584 588 593 602-603 608 610 
617-621 630-631 633 639 642-643 661 
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969 


adrenal gland 


Clontech 


ADR002 


1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420 
431 437 444 446 448 457 462 484 500 
507 517 524 532-533 539 545 554 561- 
562 564 588 597 602-603 606-607 635 
642 646 649 658 664 674 693 703 730 
740 745 752 759 765 767 775 779 799 
809 817-818 839 845 856 859 863 887 
890-891 896 948 953 958 961-963 973 


adult heart 

GIBCO


AHR001 


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124 
128-133 141 144-146 149 152 159 162- 
163 168 172 176 179 181 184 186-187 
190-191 201 203 208-209 212 216-218 
221 223 227 229 233 244 247 249 253- 
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371 
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431 
436 452-453 464-465 470-474 481-484 
487-488 490 492-494 496 499-500 505- 
506 508-509 514 523 529-530 533 547- 
548 553 558 563-565 577-578 586-588 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642- 
644 652 658 661 672 682-683 688 691 
693 697 699 708 711 713 715 732 737
745 747-748 750-753 759 761 765 768- 
770 775 790 802-803 814-815 818-819 
830 837 839-840 842 845 848 859 861- 
862 867 876-877 887 891-892 896 900- 
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 


adult kidney 


GIBCO 


AKD001 


1 3 8 12-14 17 19-25 28-29 33-34 37-39
41 46-48 50 52 55-60 62 65-67 69 71-72
75 77-78 82 84 89-90 93 97 108-110 114-
116 118-121 123-125 128 130-133 135 
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216- 
217 219 221 223-224 229 232-235 244 
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 



1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489 
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung 


GIBCO 


ALG001 


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426- 
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963 
967


lymph node 


Clontech 


ALN001 


3 10 110 146 160 168 196 209 221 269 
278 301 336 348 394 405 411 420 422 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO 


ALV001 


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139 
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911 
949 958 965 969 972-973 


adult liver 


Invitrogen 


ALV002 



3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 [illegible]
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 
587 594-595 604-605 608 610 621 630- 
631 634-635 637 657 664 690 693 699 
723 726 745 751 763 767 784 793 811
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 983 


adult liver 


Clontech 


ALV003 


60 134 169-171 275 


adult ovary 


Invitrogen 


AOV001 


1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232- 
235 240-242 246-247 249 251 254-255 
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452 
454 462 464 466-467 469-471 474 479 
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta 


Invitrogen 


APL001 


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 


GIBCO 


ASP001 


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO 



ATS001 


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 
214 221 223 230 254-255 258 263 269 
283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560 
563 574 582 589-590 593 608 616-618
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 


Genomic DNA 
from BAC 
63118 


Research 
Genetics 
(CITB BAC 
Library) 


BAC001 



515 


Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC002 


640 


Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC003 


640 


adult bladder 


Invitrogen 


BLD001 


50 55 66 71 111 143-144 148 160 201 209
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 


bone marrow 



Clontech 


BMD001 


3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118- 
120 123-124 128 130-133 143-144 148 
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221 
223-224 227 233-236 244 247 249 252 
254 258 260-262 267 269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410-
412 418-421 425-428 431 433 435 442 
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 


bone marrow 


Clontech 


BMD002 


3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156 
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488-
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001 


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16
tissue-mRNAs*


Various 
Vendors* 


CTL016 


358 740 760 


Mixture of 16
tissue-mRNAs*


Various 
Vendors* 


CTL021 


468 527 928 


adult cervix 


BioChain 


CVX001 


1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160- 
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230 
232 234-235 240-242 246 254 260-263 
268-270 274 277 282 285 292 295 297 
305-308 314-316 319 328 343-344 348 
354 358 363 368 380 382-384 389 394 
396 399 401 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464 
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769
779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871
874 877 887 891-894 897-898 901 913
916 919 921-922 925 946-947 953 958-
959 967 969 973



* The 16 tissue-mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).


diaphragm 


BioChain 


DIA002 


3 39 184 203 431 563 848 967


endothelial 
cells 


Stratagene


EDT001 


3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217
219 223-224 226-227 229 234-235 244
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320- 
321 323 325-326 328-329 331-332 334- 
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625
628 630-631 633-637 642-643 646 648
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722 
724 727 729 740 745 748-750 752 761 
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824
827-828 830 834-835 837 842 845 848
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984


Genomic 
clones from the 
short arm of 
chromosome 8 


Genomic 
DNA from 
Genetic 
Research 


EPM001 


324 515 640 


esophagus 


BioChain 


ESO002 


97 103 128 371 474


fetal brain 


Clontech 


FBR001 


67 129 156 159 232 267 433 446 503 845 
952


fetal brain 


Clontech 


FBR004 


28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870
887 906 958


fetal brain 


Clontech 


FBR006 


10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128 
130-131 138 142 148 152 159-160 179 
185 188 194 197 203 210 212 214 219
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-823 
835 843 845 856 859 864 867 876 880 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 



Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 
634 642-643 647-648 650 679 689 693 
699 712 715 742-743 745 748-749 753 
768-769 793 797 829-831 834 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001 


19 57 130-131 394 431 642 769 844 


fetal kidney 


Clontech 


FKD001 


3 31 33-34 38 48 54 72 160 208-209 211
223 264 269 277 283 290 313 325 341 
348 358 396 418-420 474 484 506 508- 
509 517 520-521 532 547 553 558 567 
569 587 596 608 610 613 619 622 626- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney 


Invitrogen 


FKD007 


3 118 186-187 230 244 271 432 887 969 


fetal lung 


Clontech 


FLG001 


69 132-133 156 168 208-209 217 267 269
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 


fetal lung 


Invitrogen 


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186- 
187 200 204 212 226 229 246 274 309 
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604- 
605 621 624 636 642-643 661 677-678
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 


fetal lung 


Clontech 


FLG004 


130-131 394 664 769 942 


fetal liver- 
spleen 



Columbia 
University 



FLS001 


3 8-10 12-13 16-17 19-25 27-29 33-35 37- 
38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521 
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822-
825 830-831 834 837 840 842 845-848
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930-
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973 
976-977 981-983 


fetal liver- 
spleen 


Columbia 
University 



FLS002 


3 8-13 15-17 19-20 22 25 28-29 33-35 37 
41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 216-218 223-224 226-230 232- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342- 
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418- 
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715
717-719 723-727 729 731-734 738-739
741 745-746 [illegible] 753 759 761 766-
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864-
865 867 874-878 888 891-892 896-900 
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 


fetal liver 

Invitrogen


FLV001 


37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240 
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642-
644 646 664-665 669 679 715 717 720 
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen


FMS001 


15 27 32 37 67 72 83 99 112 121 138 167
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967


fetal muscle


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962- 
963 973 


fetal skin 



Invitrogen 



FSK001 


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin 


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405 
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001 


276 563 842 


umbilical cord 



BioChain 


FUC001 


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 774-775 793 797 807 818 822 837 
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969 






WO 01/57190 PCT/US01/04098



fetal brain 


GIBCO 


HFB001 



3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394-
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 

[illegible]
731 735 745 747-749 752 754 761 765
767-770 [illegible] 799-

800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 
896-897 900 906-907 910-911 918 921-
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001 


86 168 186-187 297 537 608 681 761 845 
877 


infant brain 


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612- 


613 616-618 620 622 624 629-632 634- 
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-

7A9 7 e zl 7fi e -7/\/> 7AR.7fiO 770-781 7»e. 
/to f jH #Oj-#OD /Oo-/Oy / 17—1 Ox /oO- 

786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 


infant brain 


Columbia 
University 


IB2003 



3 12-13 21 27-29 32 39 49 69 72 82 91 
113 116 126 128 132-133 142 144 156 
176-177 184-185 188 194 208 212 223- 
224 228 230 244 255 259 267 270 273 
276 293-294 312 320 326-327 337 342 
346 354-355 358 361-363 382 388 390 
394 396 399 402 420 425 431 442 462 
474 482 484 488 495-496 510 520-522 

CO A C^r% C A f\ fit 1 C A c\ r ZT O COO C O Z~ coo 

524 529 540-541 549 563 582 586 588- 
589 596 600-603 606-607 612 617-618 
620-621 632 647 650 679 720-722 724 
735-7 Jo 74o 751 754 7o9 /0O-/00 lyi 
800 807 811-813 818-819 822 824 831 
834 838-840 843 856 864 892 896 907 
919-920 925 930-931 936 947 950 957 


infant brain 


Columbia 
University 


DBM002 


16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 


infant brain 


Columbia 
University 


IBS001 


84 86 180 185 198 201 203 230 279 312
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931 


lung, fibroblast


Stratagene


t conn i 



111 OC /IQ <C *7C 11/11 A\ 1 « IjCA 1*70 

j 11 2.J Vy Oj /j 1 l*r I'rl I JO IOU 1 11. 

190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


lung tumor 


Invitrogen 


LQT002 


1 3 9-10 12-13 20 31 38 41 46 48 51-52 

^ft 77 TAJIK 7ft £7 RR 1A1 1fl£- 

107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203 
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292 


294 297 301 308-309 311 314 317 321 
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 
848 855 857 859 862 864 866 870 875- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 


lymphocytes 


ATCC 


LPC001 


3 9-11 32 47 50 56 71 75 88 97 99 102 
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398-
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658 
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 


leukocyte 


GIBCO 



LUC001 


1 3 9 11 18-19 21 23-25 27 31-34 39 41- 
42 46-48 52 54-58 62-69 71-72 74-75 78- 
80 82 89-90 93 99 110 115-121 123-124 
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484
492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602- 
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte 


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532 
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC #CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767- 
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 


mammary 
gland 



Invitrogen 


MMG001 


1 14 19 21 28-29 31-37 47 49-51 55 57
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 
580 582 584 587-589 593 597 601-610 
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650 657 663- 
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738- 
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923 
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 


induced neuron 
cells 


Stratagene


NTD001 


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592 
610 628 642 650 745 748 752 761 793 
818 848 851 897 


retinoic acid
induced neuron
cells


Stratagene


NTR001 


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Stratagene


NTU001 


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643 
658 732 740 765 769 784 791 793 799 
802-803 818 842 851 864 897 907 932 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 



Clontech 


PRT001 


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 
505-506 523 537 543 564 583 602-603 
611 619 623 643 650 697 711 729 761
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001 


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521 
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland 


Clontech 


SAL001 


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981 
salivary gland


Clontech 


SALs03 


217 254 270 388 610 


skin fibroblast 


ATCC 


SFB001 


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688 


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SIN001 


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910-
911 913 948 953 959 976 984


skeletal muscle 


Clontech 


SKM001 


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 


SPC001 


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172 
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374 
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471 
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611 
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001 


35 114 130-131 144 155 176 189 206-207 
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001 


10 16 20 28-29 32 37 41 52 57 66-67 74- 
75 110 118 121 129-131 141 151 159-160 
208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500
508-509 517 524 532 537 551 558 560 
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 



Clontech 


THMc02 


1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619- 
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 


thyroid gland 



Clontech 


THR001 


3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323
325-326 334-335 340 342 348 354 358
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 
877 887 893-894 896-897 907-909 912 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963 
967 969 973 981 


trachea 


Clontech 


TRC001 


33-34 55-56 69 74 163 172 190 209 212 
267 270 297 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 


uterus 


Clontech 


UTR001 


4 9 18 37 63-64 74 108 114-115 130-131 
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 
400 403 406 411 425 431 434 437 440 
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-WATERMAN
SCORE


%
IDENTITY


1 


L06175 


Homo sapiens 


occurs in MHC class I region; ORF 


308 


98 




Y70775 


Homo sapiens 


Follistatin-related protein zfsta.


3094 


98 


3 


X15187 


Homo sapiens 


precursor polypeptide (AA -21 to 
782) 


4112 


100 


4 


AF110640


Homo sapiens 


orphan seven-transmembrane 
receptor 


344 


100 


5 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7879. 


158 


72 


6


WojOU/ 


Homo sapiens


aecreiea protein clone aazzo_o. 




i f\/\ 

1UU 


7 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 
hDRR4. 


884 


88 


8 


Y15227 


Homo sapiens 


Leul 


391 


100 


9 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


10 


X92106 


Homo sapiens 


bleomycin hydrolase 


2445 


100 


11 


Y15228 


Homo sapiens 


Leu2 


445 


100 


12 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


432 


34 


13 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


320 


27 


14 


Y71062 


Homo sapiens 


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15 


U96781 


Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


16 


M16653 


Homo sapiens 


pancreatic elastase IIB zymogen 


1435 


99 


17 


Y13398 


Homo sapiens 


Amino acid sequence of protein 
PRO346.


1749 


99 


18 


Y02283 


Homo sapiens 


Secreted protein clone br342_11
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_l 
protein sequence SEQ ID NO:66. 


1371 


100 


20 


AL031320 


Homo sapiens 


dJ20N2.5 (novel protein similar to 
fucosidase, alpha-L-1, tissue (EC 
3.2.1.51, alpha-l-fucosidase 
fucohydrolase))


2597 


99 


21 


B01384 


Homo sapiens 


Neuron-associated protein. 


1876 


100 


22 


Y68778 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-10.


2470 


100 


23 


Y55935 


Homo sapiens


Human KHS2 protein


4781 




24 


Y55935 


Homo sapiens 


Human KHS2 protein


2807 


1 Uv 


25 


AC024792 


Caenorhabditis 
elegans


contains similarity to TR'OQS05_9


463 


J 1 


26 


Y07972 


787 


Human secreted protein fragment

A Hill •Wl vVW* VIVU r'* wlvUl U UElliivLl ^ 


1540 


100 

Ivv 


27 


X97630 


Homo sapiens 


serine/threonine protein kinase 


3781 


98 


28 


AF1507^S 


ArfiiQ mncniliiQ 


iniulOlUOUJCavilll vTOSollIUvulg iacior 


J J 1*» 


Oo 


29 


AF1507^S 

/vT A J\J 1 J J 


^rftiQ mncr*nlnc 
lrxud iimavililw 


iiiiuruLuoujC'awLLn crossiinKmg lactor 




7A 


30 


Z38011 


Mus musculus 


DMR-N9 


2988 


86 






Homo sapiens


axonemal dynein heavy chain


OUDo 


oci 1 

yy 




/\r oj / z jo 


Mus musculus


coz proiem 




y i 




<A7 1 4(1 


Homo sapiens


TLS = nuclear RNA-binding protein


7Cil *7 


100 




QA7 1 aa 


Homo sapiens


TLS = nuclear RNA-binding protein


Z89U 


no 

98 


ia 

JO 


ADUjOZJ / 


Homo sapiens


vj protem -coup lea receptor (jji^z 


17o/ 


i inn 
100 


7*7 


TY7QQOA 

u/yyy4 


Homo sapiens


similar to ankyrin of Chromatium
vinosum.


6Uo9 


99 


7H 

JO 


A.0JJ50 


Homo sapiens


serum response factor-related protein


19oo 


99 


39 


AL022072 


Schizosaccharomyces pombe


lipoic acid synthetase 


1067 


61 


aa 


i TA701A 
JUJyjO 


Homo sapiens


alkaline phosphatase


Z/jI 


100 


ill 
4 1 


API 

AJT I JZ7D0 


Homo sapiens


v^vj1-j4 protein 


1 ACQ 

lUoo 


y© 


A7 


AT 1 1 7£17 


Homo sapiens


hypothetical protein




i i nn 


A1 
4J 


AT fniioa 


Homo sapiens


oiv / h fctZ . 1 \Tio vei protein / 




i nn 


AA 
44 


YARAl 1 


Homo sapiens


^JMrol 


looo 


i nn 
100 


A^ 


/\OUUZ404 


Homo sapiens


orgamc canon transporter, jUto 
simUarity to JC4884 (PID:g2143892) 




i nn 
100 


40 


W /oZ4 J 


Homo sapiens


rragment ot human secreted protein 
encoaeo oy gene iy. 


15r4V 


i nn 


47 


Y41765 


Homo sapiens 


Human PRO1083 protein sequence.


3604 


100 


HO 


A T? A07^ *5 fl 


Homo sapiens


H1 chloride channel; p64H1; CLIC4


1 ins 
1 Jl/J 


yy 


JO 


T TAO/f 1 7 


Homo sapiens


zinc tuiger protein Z,Nr 1 jj 


1 JO 1 


j7 


Jl 


a trn/j 1010 


Homo sapiens


Keratin 10 


2 j /4 


i An f 
100 




WOJOo 1 


Homo sapiens


Human secreted protem 1 . 


1 J^O 


yy 


\ ^1 
JJ 


ADUJ JJl/J 


Homo sapiens


caanenn- 1 u 




1 AA 

100 


^A 


a i 7 A77 

A 1ZUZZ 


*•« mill &44«» 

syntnenc 

OOllduUCl 


XiTDb Q 

MKr-o 


4oJ 


inn 
100 




AI 171 RQ*7 


Homo sapiens


OAJ^Zlvll 0. J vrvlA/\vloOy 




inn 

lOV 


JO 


V7iim 

X / JO JO 


Homo sapiens


n i ivivi cionc jy / ooj proiein 


HI K 
ol o 


yo 


57 


AF151018 


Homo sapiens 


HSPC184 


955 


100 


58 


AF125042


Homo sapiens 


bisphosphate 3'-nucleotidase


1586 


100 


SQ 

J7 


AFl 1 Rfi7ft 


Homo sapiens


orphan G protein-coupled receptor


1 071 

15r / 1 


1UO 


60 


X04494 


Homo sapiens 


precursor polypeptide 


1903 


100 


D 1 




Homo sapiens




J^O 


1 AA 

luo 


£7 


ni «n<»7 
Ul Jl/J / 


Homo sapiens


TN A T» 1 
U/vL>-l 


^^7 

Do / 


1 AA 
1UU 


63 


AF260665 


Homo sapiens 


histone acetyltransferase 


1510 


100 


64 


AF260665 


Homo sapiens 


histone acetyltransferase 


1429 


96 


65 


AJ277145 


Homo sapiens 


ras-related small GTPase RAB18 


1073 


100 


OO 




Homo sapiens


Human secreted protein clone
dhl073 12 protein sequence SEQ ID 
NO: 106. 


34o 


i nn 
100 


67 


Y82744 


Homo sapiens 


DNA replication and repair 
associated protein (DRASP). 


1028 


100 


68 


Y44486 


Homo sapiens 


Human GPRW receptor polypeptide. 


1721 


100 


69 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein BING4 
(similar to S. cerevisiae YER082C, 
M. sexta MNG10 and C. elegans 
F28D1.1) 


3196 

• 


100 


70 


AJ276316 


Homo sapiens 


zinc finger protein 304 


1751 


52 


"71 


"V 1 HI 1 A 


Homo sapiens


paraplegin-like protein


Al A.A 


yy 




A 171 <*TnOQ 
Ar I J /UZo 


Homo sapiens


protein phosphatase methylesterase-1


4U1 / 


IUO 


74 


Y71082 


Homo sapiens


Human B-aggressive lymphoma
(BAL) protein.


1765 


99 


75 


AF225420 


Homo sapiens 


AD025 


734 


100 


76 


X95235 


Homo sapiens 


transcription factor AP2 


217 


100 


77 


AF108420 


Takifugu 
rubripes


1-aminocyclopropane-carboxylate
synthase 


733 


56 


78 


G01349 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5430. 


X* r~ jrx 

650 


99 


79 


AL117635


Homo sapiens 


hypothetical protein 


922 


99


81 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40) 


865 


77 


82 


AF183414 


Homo sapiens 


hemin-sensitive initiation factor 2a
kinase 


3231 


99 


83 


G01143


Homo sapiens 


Human secreted protein, SEQ ID




no 


84 


U03985 


Homo sapiens 


N-ethylmaleimide-sensitive factor


3744 


99 


85 


Y17791 


Homo sapiens 


VAX2 protein


1496 


100 


87 


AF263538 


Homo sapiens 


growth differentiation factor 3 


1944 


99 


88 


Y19757


Homo sapiens 


SEQ ID NO 475 from WO9922243.


1361 


100 


89 


AF161493


Homo sapiens 


HSPC144 


1185 


100 


90 


AF161493 


Homo sapiens 


HSPC144 


856 


100 


91 


B25780 


787 


Human secreted protein SEQ ID


647 


41 


92 


U57344 


Mus musculus


Meis3 


1007 


89 


93 


AF172854


Homo sapiens 


cardiotrophin-like cytokine CLC 


1197 


98 


94 


AL390114 


Leishmania
major 


extremely cysteine/valine rich
protein


223 


29 


95 


AB016886 


Arabidopsis 
thaliana 


contains similarity to adenylate 
kinase~gene_id:MCA23.18


287 


38 


96 


AC005525 


Homo sapiens 


F22162_1


1855 


96 


97 


B20997 


Homo sapiens 


Human nucleic acid-binding protein,
NuABP-1. 


3836 


99 


98


AJ006692 


Homo sapiens 


ultra high sulfur keratin


507 


70 


99 


AF172264 


Homo sapiens 


Traf2 and NCK interacting kinase,
splice variant 1 


6942 


99 


100 


L11239


Homo sapiens 


homeobox protein


717 


100 


101


AC004890 


Homo sapiens 


similar to zinc finger proteins; 
similar to AAC01956
(PID:g2843171)


2154 


98 


102 


AC003682 


Homo sapiens 


R28830_2


1287 


48 


103 


AF201839 


Rattus 
norvegicus 


dynamin IIIbb isoform


4270 


95 


104 


Y79510 


Homo sapiens 


Human carbohydrate-associated 
protein CRBAP-6.


1394 


100 


105 


Y79510 


Homo sapiens 


Human carbohydrate-associated 
protein CRBAP-6.


1209 


90 


106 


AL096748 


Homo sapiens 


hypothetical protein 


1216 


100 


108


i Ay /zou 


Homo sapiens


Metallothionein 2


351 


1 Art 


109 


AL034422 


Homo sapiens 


dJl 141E15^ (novel protein) 


433 


100 


110 


AF191338 


Homo sapiens 


anaphase-promoting complex subunit
4 


683 


100 


111 


AL021712 


Arabidopsis 
thaliana 


putative protein 


185 


26 


112 


AF250138 


Homo sapiens 


small stress protein-like protein 
HSP22 


1063 


100 


113 


AL109976 


Homo sapiens 


dJ794I6.1.1 (novel protein) 


4176 


99 


114 


Y36151 


787 


Human secreted protein 


668 


100 


115 


AF110399


Homo sapiens 


elongation factor Ts 


1666 


100 


116 


AF210317 


Homo sapiens


member GLUT9


7fiV) 


i no 
99 


117 


Y73328 


Homo sapiens 


HTRM clone 082843 protein 


931 


100 


118 


X04085 


Homo sapiens 


catalase 


2846 


100 


119


AF147717 


Homo sapiens


ubiquitin C-terminal hydrolase
UCH37 


16V5 


100 


120


A/jooZ 


Homo sapiens


microtubule associated protein


3801 


99 


121 


AC004882 


Homo sapiens 


similar to CAA16821
(PID:g3255952)


3223 


100 


122 


M93311


Homo sapiens


metallothionein-III


421 


100 




G03827


Homo sapiens 


Human secreted protein, SEQ ID

NO: 7908. 


557 


94 


124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


222 


53 


125


AF232009


Homo sapiens 


peroxisomal trans 2-enoyl CoA
reductase


1565 


99 


126 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


127 


M60165 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2


1832


99 


128


vi n** i o 


Homo sapiens


carnitine carrier 


i cm 
1592 


100 


129 


U75467 


Drosophila 
melanogaster


Atu 


937 


36 


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta


494 


87 


131




Homo sapiens


Human elongation factor-1-delta


938 


100 


132 


Y58633 


Homo sapiens 


Protein regulating gene expression 

nnrin 

FKvari-2o. 


6745 


100 


i •ai 
1 




Homo sapiens


Protein regulating gene expression

PKOri-^O. 


4818 


95 


lol 




Homo sapiens


alpha-1 acid glycoprotein precursor


1 f\£.A 

1Uo4 


99 


135 


U72970 


Sus scrofa 


calcium/calmodulin-dependent 
protem Kmase u isoiorm gamma- rJ 


2723 


99 






Homo sapiens


Human secreted protein, SEQ ID

lNL/. 147**. 


>i co 


100 


137 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A 

XI1CXI1UC1 


627 


99 


138 


AF155648




putative ungcr proLcm 




Q7 


no 

1J7 


AF1 4463ft 


Homo sapiens


sphingosine-1-phosphate lyase


/ / 


i no 
1UU 


140 






r\ rf\tf>mri h f»Ti n oomma A 1 


d77R 


1 OA 
luv 


141 






tubulin anticrpn 


1 


i no ] 

1UU 


142 


X56667 


Homo sapiens


calretinin 


1410 


00 


143 


X92763 


Homo sapiens




1 UUJ 


i nn 


144 


Y95293 


Homo sapiens 


Human GEF containing NEK-like 


4092 


99 


145 


AF226046 


Homo sapiens 


GK003 


1198 


100 


146 
1 *tu 


M07R77 

irl /■ Oft 






JJ*r 


Ofi 


147 


AJ272212 


Homo sapiens


protein serine kinase


2196 


100 


148 


AB026491 


Homo sapiens 


PICK1 


2114 


98 


149 


AB018580


Homo sapiens 


hluPGFS 


1699 


100 


150 


X91868 


Homo sapiens 


sixl 


1509 


100 


151 


AF266505 


Mus musculus


pseudouridine synthase 3 


2135 


84 


152 


U29170 


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 





155 


AF141315 


Homo sapiens


tlljJJ Ja l f *r— IN— 

3 f*Pi"V 1 o li i rn m i n v/ 1 trait e fi» rs* c a 
avct jf i|* i Ubuaanuiljr 1 UculoieiaSe 


1 ftA"7 


100 


156 


AF110645


Homo sapiens 


candidate tumor suppressor p33 

ING1 homolog


1294 


99 


157 


AF159297




extensin-like protein




25 


158 


AL133325


Homo sapiens


ujyo*tJr**.j ^jriomeoDox protein 
NKX2B) 


1437 


100 


159 


AF073298 


Homo sapiens 


small EDRK-rich factor 2 


294 


100 


luvl 




Homo sapiens


U l small riDonucIeoprotein 1SNRP 
nomoiog, matcn to Pjuu:g4U5U0o7 


4032 


100 


161 


AR019100 


Homo sapiens


a pr^ i a - - 


yyo 


100 


16? 


Af 1#S97<1 


Arabidopsis
thaliana


putative protein 


194 


32 




ATftn56QR 


Homo sapiens


poly(A)-specific ribonuclease


3351 


100 


164 


AF117646


Homo sapiens


long ^i>L»-j protem 


2547 


99 


165 




Homo sapiens


simuar to ciliary oynein oeta neavy 

r-Vinin* 7R°yi» Qi'mi'larifv tr« D71AOQ 

cuain, to so ounuarity to rzjiiyo 
(PIDrgl 18965) 


5065 


100 


166 


Ml 0049 

1V1 1 w7t* 


Homo sapiens


nuniaii me Lai i oinion 6 in-* le 


3ol 


100 


167 


AF126484 


Homo sapiens 


CARD4 


4961 


100 




AT?1£1 *1 ft 

/vr ioi3 io 


Homo sapiens


JrioJr U I t>y 


1604 


100 






Homo sapiens


fibrinogen beta chain


2482 


100 


17ft 


MrtdOR3 


Homo sapiens


fibrinogen beta chain


2679 


100 


171


M58514 


Gallus gallus 


fibrinogen beta chain 


1059 


78 


172


Aru /oo'tj 


Homo sapiens 


lo. /Jva protein 


786 


100 


173


i A r , OAA77A 
/\^UUf if** 


Homo sapiens


L/lX-o 


923 


100 


174


z»yoy /*f 


Schizosaccharomyces pombe


putative vacuolar protein sorting-
associated protein


185 


31 


175


X56203


Plasmodium
falciparum


liver stage antigen


283 


23 


176


W /*f /ZO 


Homo sapiens


Human secreted protein ig949 3.


1879 


100 


177 


AJ222967 


Homo sapiens 


cystinosin 


1920 


100 


178


AL»uz4 /yo 


Caenorhabditis
elegans


contains similarity to TR:076167


221 


27 


179




Homo sapiens


MemDrane-Douna protem jrKU27o. 


1370 


100 


180 


AF151803 


Homo sapiens 


CGI-45 protein 


215 


28 


181 


G02694 


Homo sapiens 


Human secreted protein, SEQ ID 

NU: olio. 


283 


100 


182 


Y17292 


Homo sapiens 


Human cell death preventing kinase 
(DPK-1) protein sequence. 


2676 


100 


183


• 


Rattus
norvegicus


serine-arginine-rich splicing
regulatory protein SRRroo


148 


27 


184


/vr IjIOjj 


Homo sapiens


Cvjl-yv protein 


1214 


96 


185 


AF289664 


Mus musculus 


CYLN2 


4673 


y0 


186




Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)


4059 


100 


187 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and


2332 


100 


188 


X83543 


Homo sapiens 


APXL 


8513 


99 


189 


AF059569 


Homo sapiens 


actin binding protein MAYVEN 


3106 


99 


190 


M18135 


Rattus 
norvegicus 


smooth-muscle alpha tropomyosin 


1306 


95 


191 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


192 


D30689 


Bacillus 
subtilis 


subunit of nitrite reductase 


113 


29 


193 


Y44984 


Homo sapiens 


Human epidermal protein-1.


538 


97 


194 


B25679 


Homo sapiens 


Human secreted protein sequence 

encoded hv trenp I S WO ID "NTO-r^R 


760 


100 


195 


AB020315


787 


homologue of mouse dkk-1 gene:Acc 


1466 


100 


iy\> 


T 17*?77.0. 


Mus musculus


jenty 


9091 


7< 
ID 


1 07 

iy / 




Homo sapiens


aj j iuua i . i \ouvei proiciriy 






198 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


512 


24 


199 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta.


2027 


63 


200 


X87237 


Homo sapiens 


a-glucosidase I 


A A Al 

4447 


99 


201 


AF101078 


Caenorhabditis
elegans 


CLU-1 


% AAA 

1393 


46 


OA1 

202 


X04571 


Homo sapiens 


precursor polypeptide (AA -zz to 

1 153 J 


OOll 


1 A A 

100 


203 


X00474 


Homo sapiens 


pS2 precursor 


466 


100 


204 


AB029333 


Halocynthia 
roretzi 


HrPlil-l 


y/4 


54 


ZU3 


Ar 140U19 


Homo sapiens 


hepatocellular carcinoma antigen
gene 520 


• 


1UU 


OA< 

ZUo 


A T?/V7 1 aai 

Ar 07 1002 


Homo sapiens 


miwv-reiated peptide l; miktI 


D3Z 


IUU - 


207 


AB038162 


Homo sapiens 


trefoil factor 2


744 


100 


208 


U30521 


Homo sapiens 


TV5 1 1 ITT TX A 

P311 HUM 


3o3 


100 


209 


AB000911


Sus scrofa 


ribosomal protein 


/oZ 


f AA 

IUU 


210 


AB021227


Homo sapiens 


membrane-type-5 matrix
metalloproteinase


3545 


i nn 

IUU 


Zl I 


A T- 1 onnnn 

Ar loUyzu 


Homo sapiens 


cyclin L ania-6a


7*799 
Z /ZZ 


1 A A 


212 


AF105365 


Homo sapiens 


K-Cl cotransporter KCC4 


5624 


100 


213 


U29244 


Caenorhabditis 
elegans 


similar to human (, I KJtij transiorrmng 
protein {rUxv. ozz 1 o t) 


OUz 


-~ , 

3z 


9 i/i 

Zl** 


a r AQ 1 <.1 Q 


Homo sapiens 


A XAT7XJ1 "2 ' 1 /VistvrA 1 nPAtoin \ 

qjh / /nZ3.i ^novei protein^ 




l nn 1 


Zl J 


XDZ01 1 


Homo sapiens 


muscle determination factor


1 7£9 
1ZOZ 


IUU 


216 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog 


739 


100 


217 


AF006751


Homo sapiens 


T?C /l OA " 


4/SJ3 




218 


AB007859 


Homo sapiens 


KIAA0399 protein


3jjy 


AO 

yy 


219 


AK026291 


Homo sapiens 


unnamed protein product 


826 


100 


221 


Y84045 


Homo sapiens 


Splice variant of cancer associated 
polypeptide UHi-yai l-z. 


r Oil 


97 


ZZZ 


Z/O/yyo 


Homo sapiens 


tenascin-R (restrictin)


71 fi^ 


i fin 

IUU 


977 
ZZJ 


A 171 7/1CA7 

Ar 1 jhoUZ 


Homo sapiens 


cofilin isoform 1




IUU 


zzt 


I I / / 1 1 


Homo sapiens 


axopy reiaieu auvoanvigen ^/vLii^ 


lO 1 1 


QQ 


225 


AF190051 


Gallus gallus 


hepatocyte nuclear factor la 


443 


81 


99 
zzo 


/VJVUZOZ30 


Homo sapiens


unnamed protein product




ox 


227 


Z69368 


Schizosaccharomyces
pombe


nuf2-like coiled-coil protein


230 


25 


228 


AF275948 


Homo sapiens 


ABCA1 


11763 


99 


zzy 


A CI £11 Oil 

At lol384 


Homo sapiens 


norLzoo • 


9nn#? 
zuuo 


yo 


Z3U 


Y16270 


Homo sapiens 


paralemmin


1 Q< 1 
IV j 1 


1 AA 


231 


AJ245599 


Homo sapiens 


putative secreted ligand 


7170 

ZJ iy 


AA 

yy 


919 




Homo sapiens


T-Tuman ctArrmrVi cnrr^innm n olAnp> 

fXlllilQil 2>LUIHiU<lJ well ^/lllv/lilu l^lvlUC 

HP10412-encoded protein. 


1545 


QQ 

yy 


233 


AF096286


Mus musculus


pecanex 1 


3623 


93 


234 


V64619_cd1


Homo sapiens 


30-NOV-1990 Human HE1 cDNA. 


796 


100 


235 


V64619_cd1


Homo sapiens 


30-NOV-1990 Human HE1 cDNA.


470 


98 


236 


AF227258


Bos taurus 


RPGR-interacting protein-1


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562


Homo sapiens 


dJ684024.2 (prodynorphin
(Beta-Neoendorphin-Dynorphin precursor,
Proenkephalin B precursor))


1330


100






239 


AF262027 


Homo sapiens 


eIF-5A2


808 


100 


240 


AL079344 


Arabidopsis 
thaliana 


putative protein 


194 


33 


241 


AC002394 


Homo sapiens 


Gene product with similarity to 
dynein beta subunit 


1542 


51 


242 


AJ271361 


Takifugu 
rubripes 


FRANK2 protein 


303 


30 


243 


AL021918 


Homo sapiens 


D34I8.1 (Kruppel related Zinc Finger 
protein 184) 


1476 


48 


244 


AF190167 


Homo sapiens 


membrane associated protein SLP-2 


1736 


99 


245 


Y10601 


Homo sapiens 


ankyrin-like protein


5877 


100 


246 


AL121771 


Homo sapiens 


dJ548G19.1.1 (novel protein 
(ortholog of mouse zinc finger 
protein ZFP64) (translation of cDNA 
NT2RP3001398 (Em:AK001596))
(isoform 1)) 


3628 


100 


247 


L25314 


Drosophila 
melanogaster 


actin-related protein 


984 


47 


248 


X63745 


Homo sapiens 


KDEL receptor 


1095 


100 


249 


AF112208


Homo sapiens 


13kDa differentiation-associated 
protein 


816 


100 


250 


AP001707 


Homo sapiens 


human gene for claudin-8, Accession 
No. AJ250711 


1172 


100 


251 


AL136125 


Homo sapiens 


dJ304B14.1 (novel protein) 


778 


100 


252 


AL031186 


Homo sapiens 


bK984G1.1 (supported by FGENES)


532 


100 


253 


Y17531 


Homo sapiens 


Human secreted protein clone
BL205_14 protein.


639 


100 


254 


AL049843 


Homo sapiens 


dJ392M17.3 (KIAA0349 protein) 


6741 


99 


255 


AJ242972 


Homo sapiens 


TOLUP protein 


1424 


99 


256 


Y94873 


Homo sapiens 


Human protein clone HP02632. 


1876 


100 


257 


AF279865 


Homo sapiens 


kinesin-like protein GAKIN 


2903 


100 


258 


AL024498 


Homo sapiens 


dJ417M14.1 (novel protein) 


589 


100 


259 


R66278 


Homo sapiens 


Therapeutic polypeptide from 
glioblastoma cell line. 


830 


100 


260 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB


3226 


99 


261 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB


2821


100 


262 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB


3149 


99 


263 


AF197060 


Homo sapiens 


src homology 3 domain-containing 
protein HIP-55 


2257 


100 


264 


Y86262 


Homo sapiens 


Human secreted protein HAQAR23, 
SEQ ID NO: 177. 


766 


100 


265 


Y56966 


Homo sapiens 


Human SBPSAPL polypeptide. 


2779 


100 


266 


Y56966 


Homo sapiens 


Human SBPSAPL polypeptide. 


1018 


99 


267 


AJ300465 


Homo sapiens 


putative white family ATP-binding 
cassette transporter 


1557 


95 


268 


AC004030 


Homo sapiens 


F21856_2


3579 


99 


269 


X55954 


Homo sapiens 


HL23 ribosomal protein 


714 


100 


270 


AB033921 


Mus musculus 


Ndr1 related protein Ndr2


185$ 


94 


271 


AF081886 


Homo sapiens 


ERO1-like protein


1905 


99 


272 


AF166492 


Homo sapiens 


small GTPase RAB6B 


1060 


100 


273 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein) 


2201 


100 


274 


W88667 


Homo sapiens 


Secreted protein encoded by gene 
134 clone HAIBP89. 


1530 


99 


275 


X00129 


Homo sapiens 


precursor RBP 


1044 


97 


276 


Z47500_cd1


Homo sapiens 


11-MAY-1998 Human RHOH gene
sequence. 


1161 


100 


277 


AB049188 


Equus caballus 


ubiquitin C-terminal hydrolase


1118 


96


278


AF270647


Homo sapiens


GTT1 

VJ 111 


1564 


1 uu 


279


AF143956


Mus musculus


coronin-2 


2414 


OA 


280


R85151


Homo sapiens


Endothelial cell polypeptide.


911 


09 

yz 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031 


100 




UOJ7HO 


norvegicus 


Q 1 _ 1 nrntAin 
Ol"! pi OiCUl 


^07^ 

J7/J 


OA 

y0 


9M 


V147<fi 
x IH /Oo 


Homo sapiens


1 jvappa JtJ-iiKe protem 


ZU J / 


100 


286 


AL031316 


Homo sapiens 


dJ28O10.3 (HSD11B1
(hydroxysteroid (11-beta)
dehydrogenase 1)


294 


100 


9fi7 


T\ir>i i no 


Homo sapiens


too tamily 


177J 


99 


zoo 


A "D AO/CA/1 1 

Ax5(/Z0U4J 


Homo sapiens


MS4A7 


1230 


100 


ztsy 


Molooo 


Homo sapiens 


Krueppel-related DNA-binding
protein 


209 


90 


ooa 

zyu 


AJUUl 0 1U 


Homo sapiens 


mRNA cleavage factor I 25 kDa
subunit 


ion 
1217 


100 


9Q1 

zy i 




Homo sapiens


Human rKvJiouo (UNi^/ooj ammo 
acid sequence SEQ ID NO:395. 


oy4 


100 


zyz 


"vaahoa 


Homo sapiens


Human molecule associated with cell
proliferation, iviA^r-4.


ZJ /U 


100 


zyj 


AJZ /OlUl 


Homo sapiens


urKtob protein


zuyy 


1 f\f\ 

100 


294 


AF161406 


Homo sapiens 


HSPC288 


719 


100 


one 

Z95 


YjoOzo 

* 


Homo sapiens 


Protein regulating gene expression
PRGE-21. 


1276 


100 


296 


T TO 1 1 


Rattus 
norvegicus 


pyridoxine 5'-phosphate oxidase


1239 


87 


OAT 




Xenopus 
laevis 


ribonucleoprotein 


1624 


83 


Z^O 


ArZZo/JU 


Homo sapiens


Cyt19


1729 


99 


TOO 

zyy 


ArZzo fi\j 


Homo sapiens


Cyt19


906 


98 


i i aa 


V*Cj4^7A 


Homo sapiens 


Amino acid sequence of a human 
gastric cancer antigen protein. 


718 


89 


JU1 


A 171 KOI 

Ar 1Z53J J 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform


loOo 


100 


JvZ 


X JZZUO 


Homo sapiens


Human receptor molecule (KJiC) 
encoded by Incyte clone 2825826. 


lo/o 


98 


iao 


A XDATsfl*. 
AJrZ<»/ jOj 


Homo sapiens


hepatocellular carcinoma associated
ring finger protein




100 


304 


AF208844 


Homo sapiens 


BM-002 


428 


100 






Homo sapiens


similar 10 r^iu.gjo / /y*w 


1 Ofifi 
lVoo 


1 AA 

1UU 






Arabidopsis
thaliana


putative protein


9ih 

Z ill 


7< 1 
ZD 


307 


Y10530


Homo sapiens




1 64 S 


inn 

Ivv 


308


AF180681


Homo sapiens


gUMilUV UUvlvUUUv vAvUOUKw Idvlvl 


J J7 r 


i nn 


309 


AF111856


Homo sapiens 


sodium dependent phosphate 


3591 


99 


310 


Y13583 


Homo sapiens 


G-protein coupled receptor 


2171 


100 


i 

j 1 1 




Homo sapiens


cchoijiuz ^mercapio pyruvate . 

SUJIUIu BuSXcTase Z.o* 1 .Zyy 


1 SOR 

1 J70 


i nn 
IUU 


312 


X79535 


Homo sapiens


Vipta til Ail Tin 


2348 


mo 


313 


AF070658 


Homo sapiens 


HSPC002 


861 


100 


314 


AF078866 


Homo sapiens 


SURF-4 


1395 


100 


317 


Z37986


Homo sapiens 


phenylalkylamine binding protein.


1258 


100 


320 


AB047892 


Macaca 
fascicularis 


hypothetical protein 


258 


82 


321 


Y25755 


Homo sapiens 


Human secreted protein encoded 
from gene 45. 


1440 


100 


322 


AB016531


Homo sapiens 


PEX16 


1741 


100 


323


AL391141


Arabidopsis
thaliana


putative protein


274


49








325 


AF140501 


Homo sapiens


DNA polymerase iota


3691 


00 

77 


326 


X96698 


Homo sapiens 


D1075-like 


1450 


96 


327 


AF152325


Homo sapiens


protocadherin gamma AS


4769 


i on 

1UU 


328 


AF151803 


Homo sapiens 


CGI-45 protein 


1970 


100 


329 


X74070 


Homo sapiens


transcription factor BTF3




81 


330 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


j j i 


ws4n4n 


Homo sapiens


Human interferon-inducible protein,

IJTT7T 


484 


98 






Homo sapiens


transcription-associated zinc ribbon
protein


o91 


100 


333 


U19181 


Rattus 
norvegicus 


Rabin3 


2129 


90 




KJVJJO 1 1 


Homo sapiens 


Human secreted protein, SEQ ID 


621 


100 


335 


AL008582 


Homo sapiens 


bK223H9.2 (ortholog of A. thaliana 

rzjr l.oj 


626 


100 


336 


AF110774


Homo sapiens 


adrenal gland protein AD-001 


647 


100 


337 




Homo sapiens


Kruppel-type zinc finger protein


lo74 


CO 

58 


338 


AF207600 


Homo sapiens 


ethanolamine kinase


129 


100 


340 


AC020579 


Arabidopsis 
thaliana 


putative 

phosphoribosylformylglycinamidine 
synthase; z550y-^yy50


3283 


50 


341 


Y28576 


Homo sapiens 


Secreted peptide clone pe503_l. 


944 


100 


5^2. 


T T^OOT/I 

Ui>£z74 


Saccharomyces
cerevisiae


Ycx386wp; CAI: 0.12 


191 


37 


343 


A01771 


synthetic 
construct 


vascular anticoagulating protein 


1661 


99 


344 


AF220052 


Homo sapiens 


uncharacterized hematopoietic 
stem/progenitor cells protein 


1285 


100 


345 


Y70400 


Homo sapiens 


Human cell-signalling protein-2. 


754 


100 






Homo sapiens


Human fetal brain cDNA clone
vclo_l derived protein. 


9o2 


100 




/vr io j*fxo 


Homo sapiens


zo.4 Kua protein 


1329 


100 


348 




Arabidopsis
thaliana


putative cleavage and
polyadenylation specificity factor


1383 


55 


349 


AL032631 


Caenorhabditis 
elegans


Y106G6H.8 


194 


39 


350 


U70669 


Homo sapiens 


Fas-ligand associated factor 3


167 


23 


351 


I 7J*tUO 


Homo sapiens


Amino acid sequence of a potassium
channel interactor protein. 


i i no 
1 loz 


92 


352 




Tlrncnnhilo 
Ul\Ja\jpU.Ua 


anonz./w 


ill 


**o 


353 


AJ271684 






ton 

1U1J 


inn 


354 


AF099100 








oo 


355 


U51730 


Murine
leukemia virus


reverse transcriptase 


316 


42 


356 


D50617 


Saccharomyces
cerevisiae


YFL042C 


279 


27 


357 


D50617 


Saccharomyces
cerevisiae


YFL042C 


279 


27 


358 


AF161432 


Homo sapiens 


HSPC314 


1059 


93 


359 


AB029488 


Homo sapiens 


C11orf21


758 


99 


360 


AJ251024 


Homo sapiens 


putative odorant binding protein ag 


1239 


100 


361


U43281


Saccharomyces
cerevisiae


Lpg22p 


2074 


74 


362 


U43281 


Saccharomyces
cerevisiae


Lpg22p 


2153 


74


363 


AC007153 


Arabidopsis 
thaliana 


100632 


156 


24 


364 


AF197927 


Homo sapiens 


AF5q31 protein 


3992 


99 


365 


D28500 


Homo sapiens 


mitochondrial isoleucine tRNA
synthetase


4286 


98 


366 


X97868 


Homo sapiens


arylsulphatase


3141 


98 


367 


AL162048 


Homo sapiens


hypothetical protein


1532




368 


L36062 


Mus musculus


steroidogenic acute regulatory
protein 


189 


2S 


369 


AF113249


Homo sapiens


multiple domain putative nuclear
protein




50 


370 


M15888 


Bos taurus


endozepine-related protein precursor


2425 


fid 


371 


X66363 


Homo sapiens 


serine/threonine protein kinase 


2562 


100 


372 


W74802 




Human secreted protein encoded by
gene 73 clone HSOEL25




RO 


373 


AF100772


Homo sapiens 


tenascin-Ml 


11535 


99 


374 


AF090934 


Homo sapiens






inn 


375 


AB021643 


Homo sapiens


gonadotropin inducible transcription
repressor-3


27151 


OO 


376 


AB049758 


Homo sapiens 


MAWD binding protein


1331 


100


377 


AF070666 






too 


y t 


378 


S59342 


Mus sp. 


nuclear pore complex glycoprotein 


464 


60 


379 


AF149205 


Mus musculus 


Su(var)3-9 homolog Suv39h2 


1690 


88 


1RO 

JOv 


AT779700A 


Homo sapiens


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor


/OJ 1 


oo 


381 


AF118566


Mus musculus 


hematopoietic zinc finger protein 


1769 


92 


^R7 


a lrnofwcio 

/ITwUUUO 1 y 


Homo sapiens


unnamed protein product


oil/ 


1 oo 


383 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor


7851 


99 


384 


AF117946


Homo sapiens 


Link guanine nucleotide exchange
factor II


2363 


100 


385 


AF125390 


Drosophila
melanogaster


L82G 


139 


41 


386 


Y94907 


Homo sapiens 


Human secreted protein clone 
caiuo iyx protein sequence oHy uj 
NO:20. 


1092 


50 


JO/ 


| Tl 870^ 


Saccharomyces
cerevisiae


i eiiro**cp 


zoo 


7ft 
AO 


•J oo 


AF1773RR 




Cancer-amplified transcriptional
coactivator ASC-2


1ft74R 
1U/40 


OO 


389 


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase 7


3469 


96 


390


AF097366


Homo sapiens


cone sodium-calcium potassium
exchanger


3166 




391 


AF217525 


Homo sapiens


Down syndrome cell adhesion
molecule 


5337


60 


392 


U81035


Rattus 
norvegicus 


ankyrin binding cell adhesion
molecule neurofascin 


3967 


91 


393 


X65224 


Gallus gallus


neurofascin 


4097 


78 


394 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


4292 


99 


395 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


396 


AB017026 


Mus musculus 


oxysterol-binding protein 


2173 


98 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240) 


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 
gene 85 clone HSDFV29. 


722 


92 


399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 
(HYDRL-8). 


1637 


99


400 


AF039718 


Caenorhabditis 
elegans 


contains similarity to lupus LA 
protein homologs


325 


43 


A f\ 1 

4U1 


ACrUUUo/ / 


Methanothermobacter
thermautotrophicus


conserved protein


1 


3D 




i z / /yj 


Homo sapiens


TJhtti c»ti c f*rYP>tf*r\ nrntpin pnrnHpH V«\/ 
XxUIllall dCUClCU piUlCU-L CUSA/vlCU uy 


1 j j 7 


QO 

77 


403 


Z50853 


Homo sapiens 


CLPP 


615 


100 




"Yn*3 A7^ 


Rattus
norvegicus


nuosoniai pruicin Ljjd ^aa i 1 iuj 


j / \j 


00 

77 


*tUO 




Homo sapiens


i^wivjir pi uuciil 


111 


AA 


407 


U20239 


Mus musculus


fibrosin 


288 


76 


Ann 
401/ 


AL033378 


Homo sapiens 


QJ3ZJJV14.1 ^jviAAO/ 70 protein^ 


ouzo 


00 

77 


410 


X54326 


Homo sapiens 


glutaminyl-tRNA synthetase


/J / / 


00 

77 


41 1 


AO 1 JO J \ 


Bos taurus


polynucleotide adenylyltransferase


J/lJ 


OT 

7 / 


412 


a t?o ni 
AF217190 


Homo sapiens 


iVLL»-bLrl protein 


/ 1 


OO 

7^ 


414 


G02815 


Homo sapiens 


Human secreted protein, SEQ ID 

XNvJ. O07O. 


314 


95 


415 


AJ245922


Homo sapiens 


alpha-tubulin 8




1 nn 

1UO 


410 


AF203032


Homo sapiens 


neurofilament protein




0 1 

>C 1 


/ill 
41 / 


z,y7o33 


Homo sapiens 


cjouaj .z.i ^novei protein ^lsoiorm 


1 JO / 


1 (\(\ 

lUv 


418 


AJ404326 


Homo sapiens 




10 / J 


00 

77 


419 


AJ404326 


Homo sapiens 


SR+89 


902 


64 


420 


API 34726 


Homo sapiens 


09A 


D334 


00 

77 


421 


L28125 


Podospora 
anserina 


beta transducin-like protein 


000 
Zoo 


10 


422 


W21733 


Homo sapiens 


in it*- 1 encoaea oy cione oy. 


1 in 


ML 


423 


oo7970 


Homo sapiens 


ZiNr /D— isjvAd zinc ringer 


7 j l 




424 


L28035 


Mus musculus 


protein kinase C gamma 


3768 


98 


426 


Y73373 


Homo sapiens 


hikm clone 72 1 o03 protein 
sequence. 




JO 


427 


Y73373 


Homo sapiens 


h i lvjVL cione 72 1 o03 protein 
sequence. 




AQ 
H7 


428 


Xolllo 


Homo sapiens 


i io-2a/K±Jifs-2a 


O/O 


1fifl 
1UU 


429 


Z96932 


Homo sapiens 


nuclear autoantigen to 1 4 Kua 


*f70 




430 


AJ277291 


Homo sapiens 


HELG protein 


678 


72 


431 


X82157 


Homo sapiens 


hevin 


jjZj 


OO 

77 


432 


AC007192 


Homo sapiens 


P85B_HUMAN; PTDINS-3-
KINASE P85-BETA


3825 


99 


433 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184)


1713 


50 


434 


AF084464 


Rattus
norvegicus


GTP-binding protein REM2 


141 


29 


435 


AL049795 


Homo sapiens 


ajozZLD.z (novel protein j 


} 1*7^1? 

1 /DO 


OS 

70 


436 


M14513 


Rattus 
norvegicus 


(Na+ and K+) ATPase, alpha(III)
catalytic subunit


4ZO7 


00 

77 i 


437 


U33460 


Homo sapiens 


DNA-directed RNA polymerase I,


5/ / / 


9o 


438 


D87076 


Homo sapiens 


similar to human bromodomain 
protein BR140(JC2069) 


3067 


100 


439 


L43912 


Macaca 
mulatta


mannose-binding protein A 


589 


93 


440 


D31763 


Homo sapiens 


ha0946 protein is Kruppel-related. 


927 


49 


441 


U70976 


Homo sapiens 


arrestin 


2068 


99 


442 


B08069 


Homo sapiens 


A human beta-alanine-pyruvate 
aminotransferase (HAPA). 


2343 


99 


443 


AF100662


Caenorhabditis
elegans


contains similarity to ubiquitin
carboxyl-terminal hydrolase (Pfam:
u v*ri- 1 .iirrirn, score. zo.4o j (t'lam.
U^xl-Z.iJilllilj SCOre. t ti*D5)


166


24






AAA 


U / OU 1 / 


Rattus
norvegicus


XJCT A 1 


zoo / 


no 


445 


AL049569 


Homo sapiens 


dJ37C10.3 (novel ATPase) 


2418 


100 


A A Q 


A r>A~KAt\ 

AJ/4/j4U 


Volvox carteri
f. nagariensis


hydroxyproline-rich glycoprotein


lOD 


34 


AAQ 


AJ I 555 


Homo sapiens


ditNrzj / protein 


aZUUO 






AJI 555 J J. 


Homo sapiens


dZsv*r£5t protein 


1 UZD 


yo 


451 


AF170708 


Homo sapiens 


T-box protein TBX3 


3700 


99 


/ICO 

45/ 


AK002080 


Homo sapiens 


unnamed protein product 


1 CAC 

154o 


99 


453 


L32977


Homo sapiens 


Rieske Fe-S protein 


1239 


93 


454 


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


1533 


57


455 


Y01141 


Homo sapiens 


Secreted protein encoded by gene 7 
clone HTLFA90. 


1453 


99 


456 


AB006631 


Homo sapiens 


The human homolog of mouse Cux-2 


6559 


100 


457 


AF067165 


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180 


95 


460 


U97002 


Caenorhabditis 
elegans


similar to acyl-CoA dehydrogenases
and epoxide hydrolases; Pfam
domain PF00441 (Acyl-CoA_dh),
Score=57.4, E-value=1.7e-16, N=2;
contains similarity to Pfam domain
PF00702 (Hydrolase), Score=57.4,
E-value=1e-13, N=1


583 


37 


461 


AK023114 


Homo sapiens 


unnamed protein product 


t f\A -% 


99 


462 


M93134 


Friend murine 
leukemia virus 


pol protein 


289 


44 


463 


AF055473 


Homo sapiens 


GAGE-8 


232 


47 


466 


Y51415 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


467 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 


468 


Y57936 


Homo sapiens 


Human transmembrane protein 
HTMPN-60. 


1629 


96 


469 


D38552 


Homo sapiens 


The ha1539 protein is related to
cyclophilin.


2995 


100 


470 


Y70013 


Homo sapiens 


Human Protease and associated 
protein-7 (PPRG-7). 


3530 


100 


471 


AJ224747 


Homo sapiens 


C-terminal variant of hINADL
including 2 amino acid exchanges
and an insertion of 28 amino acids in
frame.


7969 


100 


472 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_J2 protein. 


1 340 


1 fiA 


473 


W99665 


Homo sapiens 


Human secreted protein clone 
aul57_12 protein.


998 


98 


474 


X63526 


Homo sapiens 


homologue to elongation factor
1-gamma from A. salina


2273 


nn 

! yy 


475 


X15940


Homo sapiens 


ribosomal protein L31 (AA 1-125)


644 


100 


476 


M60832 


Homo sapiens 


alpha-2 type VIII collagen 


3581 


99 


477 


AF039697 


Homo sapiens 


antigen NY-CO-31


1213 


97 


478 


AF156929


Sus scrofa 


inflammatory response protein 6 


1588 


83 


479 


AF264717 


Homo sapiens 


FYVE domain-containing dual
specificity protein phosphatase 
FYVE-DSP2 


5610 


99 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POMP 


2478 


94 


481 


X89750


Homo sapiens 


TGIF protein 


1413 


100 





482 


M93107 


Homo sapiens 


(R)-3-hydroxybutyrate
dehydrogenase


1663 


96 


483 


U58334 


Homo sapiens 


Bbp/53BP2 


1556 


41 


484 


AF151538 


Homo sapiens 


deoxycytidyl transferase; Rev1p


4281 


99 


485 


Z98884 


Homo sapiens 


dJ467L1.1 (KIAA0833)


699 


73 


486 


AJ243874 


Homo sapiens 


oligophrenin-4 


3682 


100 


487 


Z11737 


Homo sapiens 


flavin-containing monooxygenase 4 


2969 


100 


488 


X56123 


Mus musculus 


talin 


4353 


77 


489 


AJ278112 


Homo sapiens 


putative cell cycle control protein 


335 


23 


490 


W74843 


Homo sapiens 


Human secreted protein encoded by 
gene 115 clone HOVBA03.


1013 


98 


491 


Y41337 


Homo sapiens 


Human secreted protein encoded by 
gene 30 clone HRDDV47. 


509 


36 


492 


X90530 


Homo sapiens 


ragB 


1926 


99 


493 


X90530 


Homo sapiens 


ragB 


1405 


99 


494 


X90530 


Homo sapiens 


ragB 


1893 


96 


495 


AL022394 


Homo sapiens 


dJ511B24.3 (KIAA0395 (probable
homeobox protein)) 


4990 


99 


496 


Y11395


Homo sapiens 


lanthionine synthetase C-like protein 
1 


2168 


100 


497 


AJ010119 


Homo sapiens 


Ribosomal protein kinase B (RSK-B) 


4001 


100 


498 


G01563


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5644. 


330 


100 


499 


X54131 


Homo sapiens 


protein-tyrosine phosphatase 


10465 


99 


500 


G01082


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5163. 


549 


100 


501 


AC004142 


Homo sapiens 


similar to murine leucine-rich repeat 
protein; possible role in neural 
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:g1369906)


3676 


100 


502 


AL117544


Homo sapiens 


hypothetical protein 


1226 


100 


503 


AF203032 


Homo sapiens 


neurofilament protein 


5115 


99 


504 


AL034417 


Homo sapiens 


bK215D11.2 (similar to rat gene 33)


2476 


100 


505 


X69090 


Homo sapiens 


190kD protein 


7546 


99 


506 


U58755 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for
by C. elegans cDNA yk46d5.5;
coded for by C. elegans cDNA
yk43c2.5; coded for by C. elegans
cDNA yk46e8.3; coded for by C.
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
yk13f10.3; coded for by C. elegans
cDNA yk34b1.3


782 


55 


507 


AJ293309 


Homo sapiens 


NHP2 protein 


801 


100 


508 


U39045 


Rattus 
norvegicus 


cytoplasmic dynein intermediate 
chain 2B 


3241 


97 


509 


AF063231 


Mus musculus 


cytoplasmic dynein intermediate 
chain 2 


3159 


97 


510 


AF202893 


Mus musculus 


Kif21b 


4336 


95 


511 


Y13115 


Homo sapiens 


serine/threonine protein kinase 


5071 


99 


512 


AB030207 


Homo sapiens 


G gamma subunit 


364 


100 


513 


AF039571 


Homo sapiens 


peripheral benzodiazepine receptor 
interacting protein; PBR-DP/PRAX1 


495 


33 


514 


AB037883 


Homo sapiens 


Gb3/CD77 synthase 


1916 


99


515 


D90868 


Escherichia 
coli 


similar to 


1489 


100 


516 


X98834 


Homo sapiens 


zinc finger protein Hsal2 


5290 


100 


517 


AF055668 


Mus musculus 


apoptosis-linked gene 4 deltaC form


2904 


78 


518 


AF019926


Mus musculus 


protein kinase


1694 


90 


519 


M34513 


Homo sapiens 


omega protein


317 


91 


520 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein 


2313 


99 


J At 1 






ft Ji If Da ri 1 1 pi r>o r r>r»rp rr»mnl p v nrntpin 
OOtUSO uuvicai pk/JC V«UIIlJ/lwJV piuicui 




oo 


522 


AL096766 


Homo sapiens 


dA59H18.1 (KIAA0767 protein)


2497 


100 


523 


AF186249


Homo sapiens 


six transmembrane epithelial antigen 
of prostate 


1790 


100 


524 


AB029012 


Homo sapiens 


KIAA1089 protein 


4933 


100 




AoOzooVJ 


Homo sapiens 


vascular cadherin-2 


CA£1 

59oz 


100 


526 


X74331


Homo sapiens 


DNA primase (p58 subunit)


1720 


100 


528 


AC007228 


Homo sapiens 


R31665_2


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit
preprotein


2639 


100 


530 


U80446 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA 
ykl72eo.3; coded for by C. elegans 
cDNA ykl58f7.3; coded for by C. 
elegans cuna yKloot/.j, coaea tor 
Dy v>. eiegans cjl/jna. yKi /zeo.3 


420 


39 




Q.HA.Q1Q 

a/OoJo 


Mus sp.


DOS 




oo 


J JZ 




Homo sapiens


ujDovjz.z ^myosin, neavy 
poiypepuoe y, non-muscie ) 


0ft"7R 

70ZO 


i rtn 






Homo sapiens


acmcan 


977 
z / / 


j 1 


S .A 


ATJUUOIZ 


Homo sapiens


IN -aCCiy igai aC lOSdJIllIl c-*4— v^/- 

sulfotransferase 






JJJ 


AT 191 09ft 
SVUlZiyZo 


Homo sapiens


KA1RT1/1 "X fr\lf*rV<rtrir\ nnH «!pp7 

domain protein) 


jjjj 


go 


jjO 


A T991 f\<< 
AJZ / i Ujj 


Mus musculus


iroquois nomeoDOA proi^ui o 


1724 


/ u 


537 


AF180473


Homo sapiens 


Not2p 


2267 


100 




A T?A*71 ACQ 


Mus musculus


zinc linger isjn a oinuing proiein 


1U07 






AJrUzj^Dj 


Homo sapiens


actin-related protein 3-beta


9910 

ZZ 17 


inn 


540 


AC003030 


Homo sapiens 


E^29828 1 


1401 


70 


541 




Homo sapiens


KZ70Z0 1 


99 OA 


l nn 


542 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(connnues in aluzjoI/j// 


2152 


100 






Rattus
norvegicus




1Z JO 


OR 

yo 


544 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID 
Kin- fi7^i 


644 


97 




VYY7SQ . 


Homo sapiens


UaUabl ipUUll IttOlLJI X r E LTi 


2373 






AT I^TSd 4 ? 


Homo sapiens


UAJODli 1"t. 1 ^uuvci piuicui suumu 

tA a Hufll onprifiritv r)h A5JAh a tfl <? 
iu a uuai sucvjijvii^ uiiiAiyuauuvy 


964 


99 


547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA
synthase


2647 


100 




AF 114726 


Homo sapiens


NG37 


4359 


99 


549 


AB035356 


Homo sapiens


neurexin I-alpha protein


6948 


99 


551 


AB037901 


Homo sapiens 


gene amplified in squamous cell 
carcinoma-1


5215 


99 


552 


AB043634 


Homo sapiens 


PAR-6A 


885 


100 


553 


AP000693 


Homo sapiens 


partial CDS 


4875 


99 


554 


AF002223 


Homo sapiens 


myotubularin related 1 


3490 


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIA0093); 
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 





558 


X65873


Homo sapiens


kinesin heavy chain


4860 


inn 


559 


AJ277365 


Homo sapiens 


polyglutamine-containing protein 


592 


36 


560 


AF205600


Homo sapiens 


transposase-like protein 


407 


27 


DO 1 


A / I 1 AO 


Homo sapiens


glutaminyl-peptide cyclotransferase


1 Ol A. 
iy l*r 


100 


562 


X71125 


Homo sapiens 


glutaminyl-peptide cyclotransferase


1456 


97 


563 


X54304 


Homo sapiens 


myosin regulatory light chain 


897 


100 


564 


AF250842 


Drosophila 
melanogaster 


multiple asters 


130 


23 


565 


Y58608 


Homo sapiens 


Protein regulating gene expression 
PRGE-1. 


1619 


99 


566


AL121893


Homo sapiens 

« 


DA189K21.5 (novel protein similar 
to retinoblastoma binding protein 
(RBBP9)) 


1012


100 


567


AT 11 "TO 

AJLJ 17352 




Homo sapiens 


aJo7oB10.2 (novel protein (ortnolog 
of rat EX084)) 


3713 


99


JOO 


ArzzooOJ 


Homo sapiens 


piecKSuin L 


lo41 


1 AA 


569 


AF239243 


Homo sapiens 


histone deacetylase 7


3244 


86 


570 


AF087695 


Mus musculus


veli 3 


9o9 


100


571 


AB046381 


Homo sapiens 


testis-abundant finger protein


1346 


99


572 


AC005551 


Homo sapiens


R26529_2, partial CDS 


1020


100


573 


Y90290 


Homo sapiens 


Human peptidase, HPEP-7 protein 
sequence. 


274 


52 


574 


W76734 


Homo sapiens 


Human mJDia Kno targeting protein. 


/1Z 


52. 


575 


AL121935 


Homo sapiens 


bA517H2.3 (t-complex 10 (a murine
tcp homolog))


853 


78 


576


YooZ17 


Homo sapiens 


Human secreted protein HWHoUM, 

oHV^ ULr IMIJ.l JZ. 




77


577 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein) 


6329 


99 


578


AL121716


Homo sapiens


dJ202D23.2 (novel protein)


o3zy 


AA 


579 


X92715 


Homo sapiens 


KRAB /C2H2 zinc finger protein 


3102 


97 


580 


X54637 


Homo sapiens


protein tyrosine kinase 


5564 


98


581 


X78817 


Homo sapiens 


p115


1145


44


582 


AJ251245 


Rattus
norvegicus


SECIS binding protein 2 


3086 


71 


583 


AF113125 


Homo sapiens 


E-l enzyme 


581 


100 


584


M19529


Sus scrofa


follistatin A


1906 


98


585 


AF169677


Homo sapiens 


leucine-rich repeat transmembrane 
protein FLRT3


3403 


100


586 


D87685 


Homo sapiens 


similar to human transcription factor 
TFIIS (S34159).


8083 


99 


587 


Y00876 


Homo sapiens 


Human LAJrH- 1 protein sequence. 


11 1 A 

Zl 10 


100


588 


Y99674 


Homo sapiens 


Human GTPase associated protein-
25. 


2111 


99 


589 


D86973 


Homo sapiens 


similar to Yeast translation activator 

CiCNI (P1^4ol2o) 


12033 


99 


590 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 


1979 


100 


591 


Y57396 


Homo sapiens 


Human lysoenzyme LYC4 
polypeptide. 


814 


100 




a TOOTMa 


Mus musculus


torsuuj protein 




50 


593 


AF164796


Homo sapiens 


NADH:ubiquinone oxidoreductase 
MLRQ subunit homolog 


469 


100 


594 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


749 


94 


595 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 


Y77123 


Homo sapiens 


Human neurotransmission-associated 
protein (NTAP) 998868.


2102 


98 


597 


AF215703


Drosophila
melanogaster


KISMET-L long isoform


1880


65


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


598 


AJF070447 


Homo sapiens


barrier-to-autointegration factor


200 


OA 


599 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


372 


22 


600 


X79828 


Mus musculus


NK10 


909 




601 


AB004109


Cricetulus
griseus


nrio^nriatirfvl<!P'ririp «rvntria*if* 1 
puuauMiiuuj law ius ajr mn (Wv xx 


9969 
zzoz 


yz 


602


TTQ4Q88 


X/fnc miicf»ii Jnc 

iYLUa lllUoUUIllb 


i^iuip 1 


001 o 

/y iz 


89 


603


T 104088 


Mus musculus


Mil In 1 


oooo 
ZoUU 


86 


604 


A POO 69 64 


Homo sapiens


recombination and sister chromatid
cohesion protein homolog


zojU 


100 


605


A PAO£764 
ArUUOZO** 


Homo sapiens


recombination and sister chromatid
cohesion protein homolog


2530 


100 


606


.Aozzou 


Homo sapiens


rcanvjAr I 


2929 


100


607




Homo sapiens


KanOAr 1 


1843 


97 


ouo 


API 6AOAQ 


Drosophila
melanogaster




OA 'l 


58 


610


"Y74RA, 1 
.A. /Ho\J I 


Homo sapiens


gamma suDunit 01 1 cnaperonin 


2 /4d 


99


611


a r ni 1 497 


Homo sapiens


/1T1A7A10 1 /'nmf a1 tvmtAm\ 

qj Lo / j\ I? . i ^novei proiemj 


ioUo 


100 


612


V71 fl77 
I /1U/Z 


Homo sapiens


Human membrane transport protein,

TVyfTD"D 17 
iYlllvr-1 /. 


445


100 


613


V 1 6106 
A10J70 


Homo sapiens


precursor poiypepoae -zy 10 

1 1 *A 


i 7A6 


i on 


614 


A If 000981 


Wnmn cnniPnc 
rxuiuu SMipidu 


muinjucu prole hi pruuuui 


1814 






AT*01 1 198 




K^TAA0S56 nmfein 


S76 1 


OO 




Til 0161 




"NHR-1 80 


90S 
ZUJ 


91 
Zl 


617 


AF045555 


Homo sapiens


wV>^cr1 

™ UuWl A 


1908 


100


618 


AF045555


Homo sapiens


wbscr1 alternative spliced product


1318 




619 


U22229 


Felis catus 


ribosomal protein L41 


128 


100 


620


Y 17160 


Rnmn cpni pti c 


A 6 atpH tifrttPiTi 

AU rciaicu pruucui 


1810 
i or 7 




621 


Y12065


Homo sapiens 


hNop56 


2956 


99 




API 77758 


Homo sapiens


uuiquiLin specmc proicasc 10 


7QQ8 


100


623 


AF317425


Homo sapiens


GAC-1


3866 


100 


Oz*l 




Homo sapiens


hypothetical protein


lzz/ 


OO 

yy 


625 


AC007204 


Homo sapiens 


BC273239_1


3398 


99 


626


Z68747


Homo sapiens


imogen 38


2Uz4 


99 


627 


Z68747 


Homo sapiens 


imogen 38 


1958 


97 


628


Y/U2zy 


Homo sapiens 


Human RNA-associatea protein- 10 
vruNAAJr- 


3424


99 


629


A PI 01407 


Homo sapiens


□asopnaryngeai carcmoma associaiea 
gene piuucm- o 


01 J 


too 


630


AP1 10664 


Homo sapiens


transcriptional regulator protein
HCNGP


1 574 


100


631


AP1 1 0664 


Homo sapiens


transcriptional regulator protein
HCNGP


1 1 so 


£0 
07 


632 


Y17849


Homo sapiens 


ganglioside-induced differentiation 
associated protein 1


1839 


98 


633


X55740 




5'-nucleotidase


3012 


100 


634 


AF039688 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635 


AF119662 


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic-5 


2544 


100 


637 


AF077818 


Mus musculus


syntrophin-associated serine- 
threonine protein kinase 


2027 


44 


638 


AL035455 


Homo sapiens 

* 


OJ1018E9.1 (VAMP (vesicle- 
associated membrane protein)- 
associated protein B and C) 


150 


26 


639 


AF078844 


Homo sapiens 


hqp0376 protein


416


81 


640 


U28377 


Escherichia 
coli 


ORF_f239; was ORF_f191 and
ORF_f194 before splice


1198 


100 


641 


AK024442


Homo sapiens


FLJ00032 protein


1677 


JO 


642 


U58682 


Homo sapiens


ribosomal protein S28


340


100


643 


X57412 

/VJ # *TJi> 


T> aftlie rjjftllC 
IvBUU) 1 ullUO 






OR 


644 


AX>UU/,J't O 


WrttnA cani PTl C 
1 IKJLLIKJ dajJlCUj 








646 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding 

protein, i zxi jo. 


1178 


98 

• 


647




Mus musculus


jiNJv-Dinaing proiein jiNxsjjr 1 


**ouy 


ol 


O**o 


Ar>UU5/UD J 


Arabidopsis
thaliana


contains similarity to isoamyl
acetate-hydrolyzing

esieiMO^gene_io.iviv^DA.zj 


4U / 


44


650 


AC002550 


Homo sapiens 


Unknown gene product 


858 


99 


i OjI 




Homo sapiens


diabetes mellitus type I autoantigen




OO 




AoU 1 55 


Homo sapiens 


zinc ringer 4 1 


4J4y 


100






Platynereis
dumerilii


H«» protein \AJ\ I - lOij 




100 




a fwvucao 


Homo sapiens






100


ODD 


YRflA"71 


Mus musculus


rah lO 

raoiy 




DO 


030 


Juzo4y 


Rattus
norvegicus


unknown protein


oni 

aCVI 




657 


AC006014 


Homo sapiens 


similar to RFP transforming protein; 
similar to P14373 (PID:g132517)


1331 


99 


658 


X92972 


Homo sapiens 


protein phosphatase 6 


1666 


100 


659 


L35269 


Homo sapiens 


zinc finger protein 


2803 


99 


660 


AC003682 


Homo sapiens 


F 18547 1 


3184


96 


661 


X79204 


Homo sapiens 


ataxin-1 


4195


99 


662 


X17620


Homo sapiens 


Nm23 protein 




99 


663 


AB015617 


Homo sapiens 


ELKS 


1501 


80 


664 


Z56281


Homo sapiens 


interferon regulatory factor 3 


2331 


100 


665 


AJ248283 


Pyrococcus 
abyssi 


LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)
(GLYOXALASE I)


254 


40


666 


Z70200 


Homo sapiens 


U5 snRNP-specific 200kD protein 


8819 


99 


667 


Z70200 


Homo sapiens 


U5 snRNP-specific 200kD protein 


8589 


97 


668


AF153450


Manduca sexta


juvenile hormone esterase binding
protein






669


A too Tt no 


Homo sapiens 


UrKKo 


*7oai 
/ZJ 1 




670 


X99586 


Homo sapiens 


SMT3C protein 


441 


87 


671


i TtL 1 CO A aJ1 

^Ol5o9jCal 


Homo sapiens 


i7-auvj-199o ujnA encoamg a 
numan Uv-z protein. 




100


672 


AJ132702 


Mus musculus


ATFa-associated factor 


3240 


88 


673 


AF204159 


Homo sapiens 


potassium large conductance 
calcium-activated channel beta 3a
subunit 


1486 


100 


674 


G02061 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6142. 


ceo 


yy 






Homo sapiens


Human secreted protein, SEQ ID
NO: 5327. 


141 


*77 


676 


AB016839


Homo sapiens 


raobl 


419 


42 


677 


D86970 


Homo sapiens 


similar to myosin heavy chain: 
Containing ATP/GTP-binding site 
motif A(P-loop) 


161 


28 


678 


U83115 


Homo sapiens 


non-lens beta gamma-crystallin like
protein 


8569


99 


679 


AF203687 


Homo sapiens 


prolactin regulatory element-binding 
protein 


2181 


100 


680 


M27685 


Mus musculus


ultra-high sulphur keratin


650 


JO 


681 


U04968 


Cricetulus
griseus


nucleotide excision repair protein


3712 


07 


682 


AF119663


Homo sapiens


G-protein gamma-12 subunit


356 


100 


683 


G03733 


Homo sapiens


Human secreted protein, SEQ ID
NO: 7814.


342


100


684 


X67699 


Homo sapiens 


CDw52 antigen 


297 


100 


685 


AF022789 


Homo sapiens


ubiquitin hydrolyzing enzyme I


1 R09 


100


686 


AJ001006 


Mus musculus








687 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


688 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689


Af J 3D 33 / 


Homo sapiens 


stomatin related protein


2036 


100 


690 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID 

NvJl oll41. 


593 


100 


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692


A T €\1 1 1 1 < 


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked 
protein ) 


4298 


100 






Homo sapiens


thyroid receptor interactor


oUO 


100


694 


AC004542 


Homo sapiens 


OXYSTEROL-BINDING 
PROTEIN-like; similar to P22059 

^JLIJ. g 1 Zy J U 


2533 


99 


695 


AF169411 


Rattus
norvegicus


PAPIN 


4144 


52 


696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH- 

4. 


2144 


100 


697 


AF271994 


Homo sapiens 


dopamine responsive protein DRG-1 


1613 


100 


698


Y41741 


Homo sapiens


Human PRO704 protein sequence. 


1323 


100




AL133506 


Unknown 


/predicuon=(method. "genscan , 
version, l.u , score: 1U9.13 ), 
/preaicnon— ^memoa . 


825 


48


70 n 




Homo sapiens


Human goose-type lysozyme
mm 

^VJUL I 




i niri i 


701 


ACOO'^014 


T7rtnrm csntpnc 
XaUIXXU OdpiCIld 


VJwllv WXU1 SUilltcU liy It/ lal ivitlllcy - 

SUVVUiv ^XVkjy gMlw 


i ion 


inn 

1UU 


702 


AC003034 


Homo sapiens


Gene with similarity to rat kidney-

specific (KS) gene 


Q^7 


OS 


703


AJ242832 




vaiuaui 


^756 


inn 


704 


S52624 


Homo sapiens


unknown


185 


100 


705 


AF005081 


Homo sapiens


skin-specific protein


652 1 


100


706 


Y16793


Homo sapiens


keratin type 1


2232 


100 


707 


Y44985 


Homo sapiens


Human epidermal protein-2


455 


69 


708 


AF113220


Homo sapiens


MSTP040


686


100


709 


Y44985 


Homo sapiens


Human epidermal protein-2


408 


65 


710 


Y16132 


Homo sapiens


CDT6 


1874 


100 


711 


Y68775 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-7 


2407 


100 


712 


X63422 


Homo sapiens 


H(+)-transporting ATP synthase 


209 


100 


713 


AF169968


Mus musculus


DTvJA hi n din it hrotpin DFSBT * 


14/57 


70 


714 


X52563 


Bos taurus 


permeability increasing protein


383 


29 


715 


AJ277739 


Homo sapiens 


RPB1 lblalpha protein 


480 


98 


716 


AL135791


Homo sapiens 


DA162G10.3 (zinc finger protein) 


401 


98 


717 


AF223466 


Homo sapiens 


HT015 protein 


1311 


97 


719 


AF1 17383 


Homo sapiens 


placental protein 13; PP13 


746 


100 


720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating 
protein 8 (RhoGAP, p50RhoGAP)) 


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436 


Homo sapiens 


Human secreted protein, SEQ ID 


418 


96 








NO* 5*517 






723 


AF282919 


Mus musculus 


Zfp228 


349 


49 


724 


AB023191 


Homo sapiens


KIAA0974 protein


2!OJ 


100 


725 


AL031778 


Homo sapiens 


dJ34B21.1 (novel BZRP 
^OBPZooiazapuie recepior ^pcnpnerBJ^ 
(MBR, PBR, PBKS, IBP, 
looquinuiixi^uinaing proiein ) ) i^irvc 
protein) 


920 


100 


726 


at 021939 




QjjjZnZv.z ^aiaenyae 
dehydrogenase family protein) 


1764 


100 


727


API R949£ 


Rattus
norvegicus


arylacetamide deacetylase


791 


42 


728 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N- 
acetylgalactosaminyltransferase


3331 


99 


729


API ^13^ 


Homo sapiens


novel retinal pigment epithelial cell 
protein 


1652 


99 


730


AT ft7R£OA 


Arabidopsis
thaliana 


putative protein 


277 


55 


731 


i V733S7 


Homo sapiens


Hi km clone l /iZ3oo protein 
sequence. 


1720 


100 


732


AP17RA37 


Homo sapiens


PTJ3 «m4ai'« 

oH3 protein 


3302 


100 


733 


VI 7R37 


Human
endogenous
retrovirus K

env protein 


223 


34 


734


Y2885Q 

X AOOJ7 


Homo sapiens


Human mesoderm induction early
response protein ER1.


2067 


98 


735




Oryctolagus
cuniculus


protem pnospnatase z/vi t> gamma 

SUUUIlll 


2352 


99 


736 


Y94922 


xxuill U oapicilb 


xiiimdii becreteu proicm cione pvo i 
nrotpin ^ftmiwirp *^PO TT*I Tvin^ft 




f nn 


737 


AB027003 


AtA»4iJ 1 1 IliilVM 1 liar 


nmtpi'n nVir*cnhnt'jjQ** 


^7» 
J /o 




738 


AF112200 


Homo sapiens


NADH-oxidoreductase B18 subunit


7^0 


100


739 


AF112200


Homo sapiens


NADH-oxidoreductase B18 subunit


613 


88 


740 


AF302154 


Homo sapiens




OjjO 


100


741 


B25681 


Homo sapiens


Hutnnn CAcr^t^rf nrntpin ^Rnnpnrp 

X A IXLilOXl SCUvlCU iyii/tCxXX dwUUCIlWC 

encoded bv eene 17 SEO IlS NO # 70 


l*f 1U 


OO 


742 


L27479 

Mar * ■ * 


Homo sapiens


X123 




OO 


743 


L27479


Homo sapiens


X123 


1706 1 


07 


744 


Y66745 


Homo sapiens 


Membrane-bound protein PRO1186.


588 


99 


745 


AJ001019 


Homo sapiens


finer finoer ni*otein 


1707 


OO i 


746 


X68453 


Sus scrofa 


tubulin-tyrosine ligase 


1882 


94 


747 


Y57897 


Homo sapiens


Human transmembrane protein
HTMPN-21.


1 1 / J 


100


748 


AF151069


Homo sapiens 


HSPC235 


1694 


96 


749 


AF182404


Homo sapiens


mitAohonHrinl iin<*nnnlino nrnfpin 1 

llillUwUUUUJ lul UXX%«iJtX|JXXlXK UIUICIII X 


lO /*f 


100


750 


AL121993 


Homo sapiens


dJ77<vP7 1 iTMovel nmtein^ 




OO ! 

yy 


751 


AF149825


Homo sapiens


PACSIN3




100


752 


AL008635 


Homo sapiens


H 1^10141 £ 7 fhifrh-mnhilift/ orAiin 
ujj i v/xx i vj.x. ^nigii^xnuDiiiijr group 




OO 

yy 


753 


Y57914 


Homo sapiens 


Human transmembrane protein
HTMPN-38. 


1124 


100 


754 


AF285109 


Homo sapiens 


septin 3 isoform B 


1766 


100 


755 


AF004161 


Oryctolagus 
cuniculus


peroxisomal Ca-dependent solute 
carrier 


2371 


95 


756 


Z19585 


Homo sapiens 


thrombospondin-4 


4239 


100 


757 


AP001745 


Homo sapiens 


similar to zinc finger 5 protein 


1857 


100 


758 


AF190664 


Mus musculus 


LMBR2 


555 


72 


759 


AF090326 


Mus musculus 


AE-1 binding protein AEBP2 


1540 


97 


760 


AL096677 


Homo sapiens 


dJ322G13.3 (novel protein similar to
bovine and mouse beta-soluble NSF
attachment protein (SNAP-beta))


999


94






761 


AC003007 


Homo sapiens


Unknown gene product (partial)


649 




762 


U66372 


Bos taurus 


ribosomal protein S29






764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity

lUUVJJLLJr lUg jJi Ulvlii v?J-4\^ AAV A^IVS. 1. 


1152 


100 


765 


U88169


Caenorhabditis
elegans


similar to molybdopterin biosynthesis
MOEB proteins


1204 


65 


766 


AL118506




HJS01P7O 3 1 fnovpl HnaT Homain 

protein, similar to mouse and bovine 

pvctpinp ctxin 0 nrotpfri^ 
v^jroiciAXG aiilUg jmiflgiaij 


i noi 

1U7I 




767 


AK024693 


Homo sapiens 


unnamed protein product 


3767 


100 


76R 
/ uo 


71 1 SIR 


TJ/ityi /"4 eanipnc 
AlUillU odpiCUa 


nisauyi-uviN/x syninccasc 




1 AO I 




Y1 3Q1 £ 


nomo sapiens 


i-iUirf-receptor reiatea precursor ^aa 
-19 to 4525) 


/.oozy 


100 


770 




Arabidopsis
thaliana


VxOmains j x\r juiktUU WL/*tu, vj-oeta 
repeat domains. 


ill 




771 
/ / i 


ARfn7f?RS 


Mus musculus


jurV£NJr**UKe proiem 


1 Z.HO 


y I 


772 


AL161578 


Arabidopsis
thaliana


putative protein 


335 


46 


773 


AL161578 


Arabidopsis
thaliana


putative protein 


333 


47 


774 


AY00R271 


Unmn csiniAtic 
jn\JlLX\J oapjCIib 


iiciivxisc &jyu\j&\*T\u x 




oo 


77^ 


V9 1 SO J 


T_J rtyri con VATIC 

noiijo sapiens 


ix urn an secreiea proiem ^cione 


1 1Z / 




776 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 07. 


752 


100 


777


W88853


Homo sapiens


Polypeptide fragment encoded by
gene 89.


/JZ


100


778


W88853


Homo sapiens


Polypeptide fragment encoded by
gene 89.




100


779


AF196481


Homo sapiens


JMXNVJ IlligCJ pi ULCLIlj I W X £, 


^f>44 


100


780 


AL035427 


Homo sapiens




1609 


S4 


781 


AB026187 


Homo sapiens 


protocadherin-Xa 


5244 


100 


782


• 


T-fomo Qanipnc 
X ivjih vs 0 alliens 


T-ii ityi an cA/*rtf>tpH nrot^tn cpnupnri 1 

XxllillflU oCVIClCU UivilClil SCUUCIIU^? 

encoded hv oene 77 SPO ir> NO* R3 


1005 

1 vvX 


100


783 


AB027289 


Homo sapiens


cvcliTi-R hinriiniy nrntein 1 

wjr Willi i— l UUIUUIK 1/1 ULOlll 1 


5421 


100


784 


G02916 


Homo sapiens


Human secreted protein, SEQ ID
NO: 6997.


627 


100 


785 


AJ245822 


Homo sapiens


type I transmembrane receptor


4560 


100 


786 


AJ245820 


Homo sapiens


type I transmembrane receptor


4624 


100 


787 


Z48042 


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788 


AL031782 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein)


2739 


100 


789 


AJ131245 


Homo sapiens 


Sec24B protein


6602 


100 


790 


AF107203


Homo sapiens


ataxin 2-binding protein


2008 


100 


791 


Y14690


Homo sapiens 


procollagen alpha 2(V) 


600 


34 


792 


AL031055 


Homo sapiens 


dJ28H20.2 (novel protein) 


1267 


100 


793 


Y36194 


787 


Human secreted protein 


2051 


99 


794 


AB028127 


Homo sapiens


mannosyltransferase


2138 


96 


795 


AC007228 


Homo sapiens 


R31665 2 


2738 


79 


796 


AL049482 


Arabidopsis 
thaliana 


putative protein 


436 


47 


797 


AC004528 


Homo sapiens 


R32184 3 


891 


91 


798 


AB037830 


Homo sapiens 


KIAA1409 protein 


7532 


100 


799 


X53793 


Homo sapiens 


5' half of the product is homologous
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the
catalytic subunit of AIR carboxylase 


2232 


100 


800 


Y99350 


Homo sapiens


Human PROI 178 fl rNIH7 1 ^\ aminn 

acid sequence SEQ ID NO:33. 


1 *>4J 


100 


801 


AB042636 


T4omo Qflnipnc 
nviuu oopiviio 


junctopniiui typej 




47 


802 


AB029324 


Rattus
norvegicus


TIP120-family protein TIP120B


3916 


90 


803 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


4961 


90 


804 


AF251040 


Homo sapiens 


putative nuclear protein 


2119 


100 




AdU3 Jzo 1 


Homo sapiens 


F-box and WD-repeats protein beta-
TRCP2 isoform C 


2879 


100 


806




Rattus
norvegicus


transmembrane receptor UNC5H1


3257 


90 


807


AF118889


Rattus
norvegicus


b-tomosyn isoform 


3155 


97 


808




Rattus
norvegicus


selective LIM binding factor


8793 


95 


809


W iyy iy 


Homo sapiens


Human Ksr-1 (kinase suppressor of
Ras).


3939 


99 


810 


AL031782


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein)


1546 


100 


811


■ 


Homo sapiens


similar to c elegans r 1 1 A 10.5; oUto 
similarity to Z68297 (PID:gl 130619) 


Z294 


100 


812 


U83246 


Homo sapiens 


copine I 


606 


52 


813


A"P7/17-**;7 


Gallus gallus


m^^m^r r m\mMm. *m, m m a MA 

retinovin 


945 


<*i A 

34 






Homo sapiens 


zinc finger protein 10 


1651 


93 


815 


X52332 


Homo sapiens 


zinc finger protein 10 


2423 


99 


816


Y09631


Homo sapiens 


PIBF1 protein 


2935 


99


817 


X71997 


Rattus
norvegicus


myosin I 


3883 


98 


818 


AY004877


Mus musculus


cytoplasmic dynein heavy chain


11105


98


819


X2.I lyo 
• 


Homo sapiens 


Human cyclic nucleotide 
phosphodiester PDE8B(E) amino 
acid sequence. 


3790 


100 


820


AT70R 1 0/1*7 


Mus musculus


tekun 


1134


81


821


/VIA/ JO 1U0 


T_T Jmm. mm mm. ^m± *a ^-*a j*« 

riomo sapiens 


d J99 8 C 1 1 . 1 (continues m 
jfcm.AJ-#4*fj as DAZoyrl4. 1) 


871 


100 


822


AF0777QS 


Homo sapiens


TGF beta receptor associated protein-
1 


1Q< 
JOJ 


24 






nVlll O muesli li t **■ 

ivius rn us cuius 


raaicai nringe 


14-ZZ 




874 

Oat 


1JR26Q5 i 


numo sapiens 


expressea- Aqzoo x o proicm 


l*l--»** 


yy 


825 


X77371 


Mesocricetus
auratus


COR1 


641 


78 


826 


AB014-V76 


Hnmn canipnc 

numu j»apiciu> 


isjL/VfVuo /o protein 


zyo 


*70 

/y 


827 


AL04Q733 


numu oapicila 


QJO / JrXJ . l 1/vrJVl auugcD y 






828 f 


AF222980 


n\Jlxl%J oapiCIlo 


^101*111*4^*^^1 in C j*l ■■■■j'lt^li mil i r> 1 *%**r*i^ain 

vusrupijeu m ocnizopnreiua 1 proieuj 


1 o 


i on 
1UU 


829 I 
\j ±* j 


73 1 560 


xftuinu aaU ICIlo 




IOoj 


1 OO 


830 


AF29S773 




rai guanine nucieonue uissociauon 
sum uicziur 


/ 1 / 


yy 


831 


AB041926 


Homo sapiens 


GCK family kinase MINK-2 


6866 


100 


832 


L04948 


Saccharomyces
cerevisiae


mitochondrial transporter protein


338 




833 


AJ007012 


Mus musculus 


Fish protein 


704 


94 


834 


Z34289 


Homo sapiens 


nucleolar phosphoprotein p130


3455 


99 


835 


U10991 


Homo sapiens 


G2 


8436 


98 


836 


AF230877 


Homo sapiens 


MIP-T3 


2945 


99 


837 


X58288 


Homo sapiens 


protein-tyrosine phosphatase 


7734 


99 


838 


X56958 


Homo sapiens 


ankyrin (brank-2) 


9631 


100 


839 


AC024791 


Caenorhabditis 
elegans 


contains similarity to beta-lactamases


370 


24 


840 


D83197 


Homo sapiens 


ankyrin repeat protein


802 


99 


841


AF05371 1 


OCJ 111 U A 

can aria 

VUliCU Id 


n t* 11 rr\ ft 1 n mt*nt mpHmm cnHiinif 
iicuji/iJiajixcJii UfCUiUiii dui/uiui 


iy^ 




842


A F? 81 779 


T-fomn cnnipnc 
nvuiu cxil/i^uo 


Cimi lor ff\ U nm r\ campnc rihrtCAfrml 
oUJlllaX LvJ FlUJIltl SKlL/lCUd 1 lUwdwIIloX 

f>rf»tpin Tin pncnrfpH Hv frftnRanlf 

JIlvlvlU lilv CliWUUwU \Jj V-J wUJ^CLLUV 

Arr*p^*jion Number T^25RQ9 


oon 


1 OA 


843 


U76343


Homo sapiens


GABA transport protein


29Q2 


OR 


844 

Oil 


Y13645 


Wotllft carii **ti Q 




R07 

oy / 


1 UU 


845




T^ 11 f% Q O T"\ 1 f^Tl Q 
11U1UU OUlJiuJlo 


aillUwi iu sal gCJJCiaj UJiiuuiuijui Jai 

matrix processing protease mRNA 

^lvr\ X lVti JT J» 




OQ 

yy 


846 


AF192522


Homo sapiens


Niemann-Pick C3 protein; NPC3


7047


100


847 


AF192522 


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


5472 


100 


848


AUUH07 


Homo sapiens


elongation factor-1-beta


1 1 A7 
1 lOZ 


i nn 


5140 




1-IrMTin eanipne 
nuiuu SaplCHo 


X30Z / jZj? 1 


7777 


o/ 


850 


AC003682 


Homo sapiens 


R28830 1 


2401 


100 


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein) 


353 


61 


852


<64o4/D 


Homo sapiens


glucokinase regulator


3153 


no 






Homo sapiens


dJ3 /riio.2 (anj-aomam Dinaing 
protein l) 




QQ 

yo 


o3** 




Homo sapiens


Fas-associated phosphatase-1


■a on 


K 
jo 


855 


AF062741 


Rattus
norvegicus


pyruvate dehydrogenase phosphatase
isoenzyme 2


447 


80 


856


VI 1A1 1 
x 1 111 1 


Homo sapiens


pristanoyl-CoA oxidase




OS 

70 


oc'7 
OJ / 




Strongylocentrotus
purpuratus


tektin A1
















858 


AB001105 


Homo sapiens 


hippocalcin-like protein 4 


995 


100 




Ar io*t/yi 


Homo sapiens


putative j o. j ku a protein 


1 70^ 


100


860 


AF298117 


Homo sapiens 


homeobox protein OTX2 


1477 


93 


861


ArUlDZ04


Rattus
norvegicus


golgi peripheral membrane protein

pOD 


lozU 


Ol 


862


yi Aom 
l oy \j i 


Homo sapiens


JUKD SUOUnil 01 IVadjU / /*f 


1 7JIA 
1 itW 


100


863


XVI 1 Z. 1*KJ


Homo sapiens


envelope protein


7fl7 


R1 
0 1 


864


AFIAIA^O 
AX 1 1D1HJ7 


Homo sapiens


uepp 1 no 


RI S 
0 1 J 


OR 
yo 


865


AT lAQQSTC 


Homo sapiens


HI71RP1 111 /"nnvAl rbec IT 
C1J / 1 Of 1 1.1.1 \novci ClaSS 11 

Ommv/U cmaiClflaC alUlllol WJ gvTUlC 

palmoryltransferase (isoform 1)) 




inn 
i\j\j 


866


1TI / / 1 OJ 


Rattus
norvegicus


aijjiia** i "luaLj ugiuuumi 


727 


45 


867 


AF272663 


Homo sapiens 


gephyrin 


3785 


100 


868






fibulin-2




87 


869 


X82494 


Homo sapiens 


fibulin-2 


3407 


99 






1 villa xnuswuiua 


tuisiiLD pro IC ill 


107 




871


A T77R1 1 1 


T-Tf*»rt"if*\ caniPnc 
XTUU1U ttHjJlCilj 


puvd|yUi/lipaoC ^ UvLa" 1 a 


£75R 

C/^^O 


00 
yy 


872 


AF073344 


Homo sapiens 


ubiquitin-specific protease 3


256 


43 


873


VOl 


Homo sapiens


xiuman cyiosKeieion associaieci 

nmfpin 1 H ^PV^P-1 Cft 




i nn 

1UU 


874 


AJ000414 


Homo sapiens 


Cdc42-interacting protein 4 


1136 


53 


875


AF265555 


rwuivi aa|/ibiid 


ubiquitin-conjugating BIR-domain
enzyme APOLLON


627 


100


876 


Y48586 


Homo sapiens 


Human breast tumour-associated 
protein 47. 


2531 


98 


877 


AF182198 


Homo sapiens 


intersectin 2 long isoform 


8764 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone 
GPR N-terminal sequence.


210 


23 


881 


AL021068 


Homo sapiens 


dJ206D15.3


2615 


99 


882 


AC005498 


Homo sapiens 


R31665 2 


11 ft 

■3 10 




883 


AF165518 


Homo sapiens 


MAGOH isoform


1 OL 


94 


884 


D21211 


Homo sapiens 


protein tyrosine phosphatase (PTP- 

BAS tvoe 31 


368 


43 


885 


U13045 


Homo sapiens 


nuclear respiratory factor-2 subunit 

beta 1 


869 


62 


886 


X52836 


Homo sapiens


tryptophan hydroxylase (AA 1 - 444)


IMA 


98 


887 


X51466 


Homo sapiens


elongation factor 2


4460 


100 


888 

W V W 


AB039903 


Hotnrt cnnipnc 


iiiLcjrici uu~ responsive linger proiem 1 
long form 


1 AO/ 1 

1096 


98 


889 


X51760 


Homo sapiens 


zinc finger protein (583 AA)


3130 


100 


890 




Homo sapiens


voltage-gated sodium channel beta-3
subunit


1024 


100 


891 


W67928 




rragmeni 01 nuroan secreteo protein 

f*T\f*C\(\e*t\ Kv nana A 


391 


100 


892 


AB020598 


Homo sapiens 


peptide transporter 3 


3017 


100 


893


Y66n48 

A UvlTfO 




Membrane-bound protein PRO1120.


4122


99


894 


Y66648 


Homo sapiens


Membrane-bound protein PRO1120.


3606 


96 


895 

• 


A29218 cd 

1 




i y-rvvj v -i o ujn/\ encoaing Vj- 
protetn coupled 7 TM receptor with 

AYOB 1 S arKvitv 

^»-rV v^XV 1 _> dUUVliV. 


2178 


100 


896


AJ000332


Homo sapiens


UiUl/UolUajC AX 


CA£Q 

OXJOD 


99 


897


X98259


Homo sapiens




1AOC 

lUoD 


100 


898 


X57110 


Homo sapiens






Oft 1 

99 


899 


X63652 


Homo sapiens


uud-aijjua-uypbm liuaiDuor neavy 
chain TTEH1 


JJ /o 


98 


900 


X85134 


Homo sapiens


t? nrn^Pin diTiHincr nmfoin 
*W IJIULCiU l/JJIUUlg |/IULCUi 


Oftl iC 
ZolD 


oft 1 
99 1 


901 


L11672


Homo sapiens


zinc finger protein




JO | 


902 


Y85565 


Homo sapiens


UNC-53/2^ seouence 


joy 


DO 1 

od | 


903 


X54871 


Homo sapiens


ras related protein Rab5b




100


904 


Z98265 


Homo sapiens 


plakophilin 3 


4065 


100 


905 


AL035295 


Homo sapiens


hypothetical protein




oo ~n 


906 


AF051782 


Homo sapiens


diaphanous 1


Jim 


J J j 


907 


AF208536 


Homo sapiens


nucleotide binding protein; NBP


1179 


i nn 1 
1UU | 


908 


U79240 


Homo sapiens


serine/threonine protein kinase


91ISS 


Oft i 


909 


U79240 


Homo sapiens


serine/threonine protein kinase




oo 1 


910 


AJ132545


Homo sapiens 


protein kinase 


2921 


100


911 


AJ132545


Homo sapiens


protein kinase


lOJ / 


99


912 


AL121733 


Homo sapiens


hypothetical protein




nn I 

yy j 


913 


Y67579 


Homo sapiens


Human death inducer-obliterator-1
(DIO-1) polypeptide


1 ^ft< 


1 ft A I 


914 


X87342 


Homo sapiens


Human giant larvae homologue




oo 1 


915 


X87342 


Homo sapiens 


Human giant larvae homologue 


3495 


96


916 


M94362 


Homo sapiens


lamin B2




oa 1 


917 


AJ011654 


Homo sapiens






1 0ft 1 

100 | 


918 


AJ131899 


Rattus 
norvegicus 


proline rich synapse associated 
protein 1 



5776 


88 


919 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1816 


100 


920 


U95822 


Homo sapiens 


putative transmembrane GTPase 


1237 


100 


921 


Y11588 


Homo sapiens 


apoptosis specific protein 


1492 


100 


922 


X84195 


Homo sapiens 


acylphosphatase 


510 


100 


923 


U72882 


Homo sapiens 


interferon-induced leucine zipper
protein 


1409 


99 


924 


AE000660 


Homo sapiens 


hADV36S1


573 


100 


925 


AF126245 


Homo sapiens 


acyl-Coenzyme A dehydrogenase-8
precursor 


2162 


100 


926 


AE001968 


Deinococcus 
radiodurans 


hypothetical protein 


147 


27 


927


Wo 1576 


Homo sapiens 


EBV-induced G-protein coupled
receptor (EBI-2) polypeptide. 


1778 


100 


928


U01317


Homo sapiens


beta-globin


687 


94 




Ay 8333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 42 SEQ ID 
NO:165. 


1401 



100 


931 


Y91644 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317. 


1243 


100 


932 


D90279


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790 


Homo sapiens 


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 



match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554; 
match: Q08150 Q40195 P20340 
Q39222; match: Q40368 P36412 
P40393 Q40723; match: CE01798 
Q38923 Q40191 Q41022; match: 
Q39433 Q40177 Q40218 Q08146; 
match: P10949 P11023 Q16948
Q20337; match: Q25389 P25228 
P20336 P05713; match: P35276 
Q08147 P17609 P22128; match: 
Q15771 P36410 P35291; GTP-binding


726 


94 


936 


AB041533 


Homo sapiens 


sperm antigen 


1054 


38 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


3914 



100 


938 


AB032481


Homo sapiens 


homeobox transcription factor 


1744 


100 


939 


AF111106


Homo sapiens 


protein serine/threonine phosphatase
4 regulatory subunit 1 


4682 


99 


940


Y17999


Homo sapiens


Dyrk1B protein kinase


3331 


99 


941 


AF305872 


Homo sapiens 


thyroglobulin 



455 


92 


942 


AF263462 


Homo sapiens 


cingulin 


5939 


99 



943 


AK024442 


Homo sapiens 


FLJ00032 protein


1616


61 


944 


Y35911


Homo sapiens 


Extended human secreted protein
sequence, SEQ ID NO. 160. 


262 


35 


945 


AB015320


Homo sapiens 


sigma1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2 


229 


35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase 


6207 


99 


948 


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens 



dJ453C12.6.1 (uncharacterized 
hypothalamus protein (isoform I)) 


257 


42 


951


AB032435 


Homo sapiens 


differentiation-associated Na-dependent
inorganic phosphate
cotransporter 


3063 


99 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 


953 


X83587 


Mus musculus 


1A13 protein


1420 


59 


954 


AL031665 


Homo sapiens 


dJ545L17.5.1 (novel protein) 


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like 
protein (HFASLP).


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55 


957 


U6S535 


Mus musculus 


aldo-keto reductase


451 


73 


958 


AC007067 


Arabidopsis 
thaliana 


T10O24.10 


1594 


57 


959 


U72194 


Mus musculus 


muskelin 


1947 


99


960 


AE003661 


Drosophila 
melanogaster 


CG15168 gene product


777 




961 


X80332 


Mus musculus 


rab20 






962 


Y67315 


Homo sapiens


Human secreted protein BL89_13
amino acid sequence




yy 


963 


Y67315 


Homo sapiens 


Human secreted protein BL89_13 

amino acid sequence


3916 


99 


964 


L32602 


Rattus 

norvegicus


homeodomain 159..341 


1821 


96 


965 


Z97832


Homo sapiens 


dJ329A5.3 (KIAA06460 protein) 


3581 


99 


966 


W88995 




Polypeptide fragment encoded by
gene 146. 


1 /O 


il 1Q * 

39 


967 


U12465 


Homo sapiens


ribosomal protein L35


OU4 


100 


968 


AF151803 


Homo sapiens 


CGI-45 protein 


1101 


78 


969




Homo sapiens


Human secreted protein encoded by
gene u/ cionc riivivvxr 


1348 


98 


970 




Homo sapiens


succinate dehydrogenase flavoprotein
subunit 


70 J 


100 


971 


AJ113521 


Drosophila
buzzatii


protease, reverse transcriptase,
ribonuclease H, integrase


194 


23 


972 


AC006017 



Homo sapiens 


N-acetylgalactosaminyltransferase;


3271 


100 


973 


Z81317 


Schizosaccharomyces pombe


j-'iN^v^*iN/vAVi / nencose ianuiy 
protein


OOJ | 


1 1 
J 1 


974 


M17885 


Homo sapiens 


acidic ribosomal phosphoprotein (P0) 


792 


100 


975 


U22829 


Mus musculus




jyy 


40 


976 


AL132772


JUL/ iVllO 


u j iuj j/uz. 1 ^nepauc nuclear lacxor 
4, alpha) 


Z400 


yy 


977 


AC003973 


Homo sapiens




1 ^^n 

1 jjU 


HJ 


978 


J04031 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9; 
EC 6.3.4.3) 


2824 


63 


979 


AF136715 


Homo sapiens 


taxol resistant associated protein 


217 


76 


980 


AF136715 


Homo sapiens 


taxol resistant associated protein 


306 


95 


981 


Z92822


Caenorhabditis 
elegans 


ZK520.1 



1109 


44 


982 


AJ295149 


Homo sapiens 


putative dipeptidase 


1564 


99 


983 


AL021331 


Homo sapiens 


dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE) 


1492 


100 


984 


AL161501 


Arabidopsis 
thaliana 


putative adenosine deaminase 


370 


38 



TABLE 3 



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 4.259e-14 97-120 


3 


BL00298 


Heat shock hsp90 proteins family 
proteins. 


BL00298A 10.97 1.000e-40 74-119
BL00298E 27.30 1.000e-40 321-376
BL00298F 11.21 1.000e-40 409-464
BL00298H 20.50 1.000e-40 553-607
BL00298C 16.40 2.286e-40 186-230








BL00298B 15.64 1.290e-39 134-181
BL00298G 24.57 5.345e-39 465-520
BL00298I 30.07 7.818e-34 661-715
BL00298D 17.97 6.226e-33 242-282


4 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237A 11.48 4.316e-13 57-82


5 


PD02454 


!!!( PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75- 
103 


6 


DM00864 


EGF-LIKE DOMAIN. 


DM00864A 15.21 7.429e-09 98- 
119 


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138-160
PR00237B 13.50 8.250e-09 61-83


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289 


10 


BL00139 


Eukaryotic thiol (cysteine) proteases 
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-408
BL00139A 10.29 7.511e-09 67-77


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689-725
BL01113C 13.18 4.857e-11 757-777
BL01113D 7.47 2.161e-10 790-800


13 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.813e-14 599-635
BL01113C 13.18 4.857e-11 667-687
BL01113D 7.47 2.161e-10 700-710


14 


BL00594 


Aromatic amino acids permeases 
proteins. 


BL00594A 16.75 6.531e-10 50-94


15 


BL01047 


Heavy-metal-associated domain proteins. 


BL01047B 19.73 4.913e-13 707- 
728 


16 


PR00625 


DNAJ PROTEIN FAMILY
SIGNATURE 


PR00625A 12.84 7.462e-18 310- 
330 PR00625B 13.48 3.939e-15 
340-361 


18 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175-195
PR00741F 14.66 9.262e-21 243-265
PR00741B 14.23 1.947e-18 128-145
PR00741G 9.29 2.180e-17 318-340
PR00741C 9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-374
PR00741A 9.24 3.596e-13 89-105
PR00741E 13.39 3.535e-12 215-232


22 


BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.647e-20 117- 
148 BL00107B 13.31 1.000e-16 

1 SO 1 0ft 


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


24 


BL00107 



Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


27 


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 2.324e-16 91-139


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain 


BL00018 7.41 3.250e-10 681-694






proteins.


TH ftflfil R 7 41 ADHfOn 71*7 7**fi 






C*Aa domain nrnteinQ 




33 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 401-416


34 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 411-426


36 


PR00426 


C5A-ANAPHYLATOXIN RECEPTOR 
SIGNATURE 


PR00426D 10.59 3.618e-12 110- 
122 


37 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 2.049e-10 1080- 
1135 


38 


BL00350 


MADS-box domain proteins. 


BL00350 20.79 1.000e-40 1-55


40 


BL00123 


Alkaline phosphatase proteins. 


BL00123B 19.31 1.000e-40 90-133
BL00123C 24.61 1.000e-40 145-195
BL00123E 22.25 1.000e-40 304-358
BL00123G 26.01 1.000e-40 438-488
BL00123F 19.03 8.714e-35 364-399
BL00123A 10.80 9.000e-24 52-77
BL00123D 12.73 1.000e-17 216-229


44 



PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.800e-14 346-359
PD00066 13.92 4.600e-14 486-499
PD00066 13.92 1.000e-13 374-387
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443
PD00066 13.92 8.714e-12 514-527
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331


45 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 2.946e-10 180-217


47 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 1.682e-10 475- 
501 BL00649B 20.68 7.387e-09 
417-463 


50 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 



PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290
PD00066 13.92 8.800e-14 333-346
PD00066 13.92 9.400e-14 361-374
PD00066 13.92 4.000e-13 389-402
PD00066 13.92 6.571e-12 473-486


51 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 417- 
464 BL00226B 23.86 3.348e-35 
251-299 BL00226C 13.23 1.429e- 
24 316-347 BL00226A 12.77 
1.857e-15 151-166 


52


PI? ft/i7 1 7 


SIGNATURE 


T>RfMV>17r* 111 01 < £AftA_rtQ 

149 


53 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 1.000e-40 143-191
BL00232A 27.72 2.350e-28 49-82
BL00232B 32.79 7.052e-21 252-300
BL00232C 10.65 6.625e-20 250-268
BL00232B 32.79 1.314e-11 367-415
BL00232C 10.65 9.308e-10 470-488




BL00303 


S-100/ICaBP type calcium binding 


BL00303B 26.15 8.759e-23 125-153



PCT/US01/04098







i pro ic in. 


1 Ox OJLUU JU J A 2 1 . / / i .uuoe-2 1 

82-119 


58 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 1.000e-15 242-261
PR00378B 13.80 9.250e-13 109-129


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120- 
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz)
family proteins.


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338


65 


BL01019 


ADP-ribosylation factors family proteins.


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188-212
PR00237G 19.63 7.207e-13 268-295
PR00237A 11.48 4.375e-11 24-49
PR00237C 15.69 3.057e-10 101-124
PR00237D 8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-10 230-255
PR00237B 13.50 9.438e-10 57-79


70 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.938e-28 31-70 


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE 
PROTEASE (S16) SIGNATURE


PR00830A 8.41 8.759e-12 348- 
368 


72 


BL00120 


Lipases, serine proteins.


BL00120B 11.37 2.149e-10 148- 
163 


77 


PR00753 


1-AMINOCYCLOPROPANE-1-CARBOXYLATE SYNTHASE
SIGNATURE


PR00753E 8.01 3.552e-11 191-216
PR00753D 6.85 2.778e-09 131-153


78 


PR00506 

D21 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE
SIGNATURE


PR00506C 19.40 8.017e-09 96-119


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436- 
467 


84 


BL00675 


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 


BL00675A 24.86 8.800e-10 256- 
300 


85 


BL00027


Homeobox domain proteins.


BL00027 26.43 2.286e-30 117-160


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264-300
BL00250B 27.37 1.450e-26 328-364


91 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.250e-17 10-35 
i3LUUzl5A 13. oz D.UUUe-lo 221- 
246 BL00215A 15.82 7.857e-12 

11 168-181 




BL00027


Homeobox domain proteins.


xJijUuuz/ zo.*ij y.ozoe-z^- j2*h^o/ 


so 


rKUUl)94 J 




i*K.uuuy4u iz.y^ i.uuue-uo iiy- 
136 


96 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 2.091e-09 143-165


0*7 


di Arn^o i 
JdIAHJ/DZ i 


AJr A proiein. 


OT AA7<7D 1Q 177 7QQa AO *5B *70 i 


98 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2^68e-10 135- 1 
149 


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122- 
141 


100 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.429e-31 118-161 


101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246











BL00028 16.07 6.100e-10 258-275


102 


PR00048 



C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 665-679
PR00048A 10.52 8.500e-14 581-595
PR00048A 10.52 9.250e-14 637-651
PR00048A 10.52 2.059e-12 609-623
PR00048A 10.52 2.588e-12 469-483
PR00048A 10.52 7.353e-12 553-567
PR00048A 10.52 2.895e-11 525-539
PR00048A 10.52 4.316e-11 441-455
PR00048A 10.52 5.263e-11 413-427
PR00048B 6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513-523
PR00048A 10.52 5.696e-10 497-511
PR00048B 6.02 8.875e-10 429-439
PR00048B 6.02 1.000e-09 457-467
PR00048B 6.02 6.684e-09 485-495


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74
PR00195C 11.50 3.455e-21 126-144
PR00195D 11.76 8.714e-21 175-194
PR00195F 16.20 8.500e-20 217-237
PR00195E 9.82 8.650e-20 194-211



104 


BL01113 


C1q domain proteins.


BL01113A 17.99 1.865e-09 121-148
BL01113A 17.99 5.846e-09 82-109


105 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73-102
BL00420A 20.42 5.708e-09 85-114


108 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 27-41
PR00860A 5.46 5.500e-16 5-18
PR00860C 9.61 1.474e-14 41-51


112


BL01031


Heat shock hsp20 proteins family profile.


BL01031C 17.68 6.400e-10 122- 
147 


114 



DM01840 


kw SPAC24B11.09 R07E5.13.


DM01840B 22.04 2.688e-40 59-103
DM01840A 10.95 9.571e-13 31-43


115 


BL01126


Elongation factor Ts proteins.


BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-135
BL01126C 9.20 9.735e-11 190-203


1 K 
1 10 




Sugar transport proteins. 


BLUOzIod 27.54 4.375e-21 35-85 


118 


BL00437 


Catalase proximal heme-ligand proteins. 


BL00437A 18.82 1.000e-40 49- 
101 BJL00437B 16.28 1.000e-40 
114-lQo BJL00437C 21.86 l.OOOe- 

1.000e-40 248-301 BL00437E 
23.95 1.000e-40 327-379 


119 


BL00140 


Ubiquitin carboxyl-terminal hydrolase 
family 1 cysteine activ. 


BL00140D 22.64 8.274e-14 164-208
BL00140C 11.80 5.444e-10 77-102


120 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 6.712e-10 95-148


122 


BL00203 


Vertebrate metallothioneins proteins.


BL00203 13.94 1.000e-40 16-62 


123 


PR00041 


CAMP RESPONSE ELEMENT 


PR00041D 7.95 2.906e-09 24-41










BINDING (CREB) PROTEIN 
SIGNATURE 




124 



PR00041 


CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN 
SIGNATURE 


PR00041D 7.95 2.906e-09 24-41 




BL00061


Short-chain dehydrogenases/reductases
family proteins. 


BL00061C 7.8o 3.250e-10 212- 
222 


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 251-290 


127 


PR00318


ALPHA G-PROTEIN (TRANSDUCIN) 
SIGNATURE 


PR00318D 16.28 1.900e-34 219-248
PR00318B 14.79 3.455e-27 168-191
PR00318C 12.09 7.000e-23 197-215
PR00318A 7.84 1.600e-19 35-51
PR00318E 7.23 2.500e-12 265-275


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824B 9.21 7.750e-22 133- 
153 


131 


BL00824 


Elongation factor 1 beta/beta'/delta chain 
proteins. 


BL00824C 14.58 1.000e-40 166-204
BL00824D 14.04 1.621e-38 204-239
BL00824B 9.21 7.750e-22 133-153
BL00824E 12.49 1.000e-19 247-263


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1209- 
1228 


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-1187


134


PR00708


ALPHA-1-ACID GLYCOPROTEIN
SIGNATURE 


PR00708D 14.67 1.000e-27 141-168
PR00708C 11.77 1.643e-25 98-120
PR00708B 15.15 2.174e-24 73-95
PR00708E 13.33 1.600e-21 189-207
PR00708A

1 A AH O £Q4£a_'?1 ^1 *7fl 


135 


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 8.468e-l3 126- 

1 AC 

143 


136 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.250e-10 201- 
217 


137 


BL00471 


Small cytokines (intercrine/chemokine) 
C-x-C subfamily signat 


BL00471 23.92 7.480e-10 42-90 


140 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.582e-10 328-346
PR00205B 11.39 9.018e-10 543-561


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976-1027


143 


PR00979 


TAFAZZIN SIGNATURE 


PR00979E 10.83 5.950e-26 192-214
PR00979A 11.91 8.773e-25 63-83
PR00979C 12.16 6.400e-19 108-124
PR00979D 12.38 7.955e-

1 Q 1 Trt 1 Oc DD Artrt TOT* T rt 1 A 

3.382e-15 230-244 PR00979B 
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 17.7K. 


DM00686C 14.14 7.720e-09 111- 
131 


146 


PR00604 


CLASS IA AND IB CYTOCHROME C 
SIGNATURE 


PR00604D 15.86 1.000e-17 87-104
PR00604B 12.73 9.591e-16 57-73
PR00604C 10.21 8.200e-12 73-84
PR00604E 10.13 1.000e-11 106-117
PR00604A 11.13 8.800e-11 44-52
PR00604F 8.60 1.000e-10 123-132


147 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.864e-15 266- 
297 BL00107B 13.31 6.143e-ll 
335-351 


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81 


149 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 



PR00069D 19.36 1.857e-30 187-217
PR00069A 16.01 7.429e-25 41-66
PR00069E 18.14 3.100e-22 235-260
PR00069C 16.03 7.000e-20 151-169
PR00069B 11.33 8.071e-19 101-120


150 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.688e-27 139-182 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR.


PD02906C 24.17 7.070e-22 165-200
PD02906B 15.35 8.393e-15 114-127
PD02906A 10.84 6.500e-09 71-84


153 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 



BL00479A 19.86 5.091e-12 891-914
BL00479B 12.57 1.837e-11 915-931



158 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.786e-31 143-186


160 


BL00422 


Granins proteins. 


BL00422C 16.18 7.750e-12 420-448


162 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 9.297e-11 62-82


164 


BL01282 


BIR repeat proteins.


BL01282B 30.49 6.182e-10 347- 
386 


166 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 83-97 
PR00860A 5.46 1.000e-18 61-74 
PR00860C 9.61 1.900e-15 97-107


167 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196-218


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 316-353
BL00514G 15.98 2.241e-34 471-501
BL00514H 14.95 6.571e-27 510-535
BL00514E 14.28 1.273e-16 388-405
BL00514D 15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260-276
BL00514F 11.65 9.690e-14 416-431
BL00514A 11.68 8.200e-11 149-159


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 268-305
BL00514G 15.98 2.241e-34 423-453
BL00514H 14.95 6.571e-27 462-487
BL00514E 14.28 1.273e-16 340-357
BL00514D 15.35 9.100e-15 321-334
BL00514B 16.42 4.857e-14 212-228
BL00514F 11.65 9.690e-14 368-383
BL00514A 11.68 8.200e-11 101-111


171 



BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514G 15.98 2.241e-34 385-415
BL00514H 14.95 6.571e-27 424-449
BL00514C 17.41 4.632e-24 230-267
BL00514E 14.28 1.273e-16 302-319
BL00514D 15.35 9.100e-15 283-296












228 BL00514F 1 1.65 9.690e-14 
Hn-14^ "RT OfVtldA 11 fift oonno 

11 101-111 






'Homeobox' domain proteins.


tit nnnoo A"X o /inn* oo i io 


174 


DM01970 


0 kw ZK632. 12 YDR3 1 3C 

¥7TvTr>OQOA>f Al ITT 
xl IN l^^a WiVL/vL» iXX. 


DM01970B 8.60 5.119e-15 1391-


176 


BL00773 


Chitinases family 19 proteins.


BL00773C 9.42 8.000e-09 2-16 


1 AO 


PR00109


TYROSINE KINASE CATALYTIC

DOMAIN SIGNATURE 


PK00109B 12.27 9.1o3e-14 141- 
160 


1517 


PD01937


DNA rKvJ I tilN FUJL YMiiRAais 
ENDONUCLEASE DNA-. 


PD01937A 0.08 3.475e-09 221- 
232 


1 fi< 


BL00845


CAP-Gly domain proteins.


BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 


186 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-541


187 


PR00452 



SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-513


"too 

188 


DM01803 


HERPESVIRUS GLYCOPROTEIN H.


DM01803A 10.51 1.000e-09 1081-1102


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 5.091e-15 69-82 



190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145-174
PR00194E 8.74 3.250e-30 231-257
PR00194D 9.57 1.500e-26 175-199
PR00194B 10.24 5.200e-24 120-141
PR00194A 7.86 4.857e-21 84-102


192 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.154e-09 131-146
PD02042A 21.13 5.909e-09 94-121


193 


PR00021


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


1 ac 


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins.


BL00463 8.22 5.071e-09 111-123


196 


PR00118


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR00118F 16.42 9.386e-09 165- 
181 


197 


DM00215


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234- 
267 


1 0ft 


BL00660


Band 4.1 family domain proteins.


JdIAIUODU A j i .jU j . DUUe- 11/ 14- 

767 


1 OQ 




Kazal serine protease inhibitors family
proteins. 


oLUUxoZ lo.oo o.o2Ue-13 7U-V3 




Tjn AA AAA 


TYPE I EGF SIGNATURE


Tjn A AAA A A 1 A 1<C<3>I<a 1 C A*7 1 

FK0U0U9A J 4. 1 5 5.34 je-1 5 9 / J - 
987 PR00009C 14.11 8.773e-13 

CiQ< 1 AAQ DDAnnAon lKOI 
990-lUUo x*lvUUUU9LI 1O.03 ■ 

8.000e-ll 1008-1018 PR00009C 
14.11 l.882e-09 892-904 


203 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 4.536e-19 38-59 




BL00018


EF-hand calcium-binding domain
proteins. 


rt nnni 8741 7 inn a. in i<?^«i7ft 


206 


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-ll 67-86 


207 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 3.423e-20 39-60 
BL00025 17.17 8.750e-16 88-109 


209


BL00646


Ribosomal protein S13 proteins.


BL00646B 21.42 6.100e-30 110-143
BL00646A 25.82 6.192e-29 14-62


210


PR00138


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279-305
PR00138C 16.41 3.000e-24 218-247
PR00138E 6.01 8.714e-13 314-328
PR00138A 15.14 9.538e-13 134-148
PR00138B 15.82 4.522e-12 188-204


211 


DM01206 



CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.429e-12 386-406
DM01206B 10.69 1.247e-10 384-404
DM01206B 10.69 5.068e-10 388-408


212 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 



PD01941A 14.81 1.000e-40 163-217
PD01941B 15.02 9.705e-30 420-467
PD01941E 15.92 8.714e-23 837-884
PD01941C 19.96 8.200e-20 508-563
PD01941D 27.18 1.600e-16 661-710
PD01941F 28.52 9.645e-15 1005-1060


213 


BL00362 


Ribosomal protein S15 proteins.


BL00362 24.67 8.313e-09 330-373


214 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115Z 3.12 2.125e-09 1178-1227
BL00115Z 3.12 6.096e-09 1164-1213


215 


BL00038 


Myc-type, 'helix-loop-helix' dimerization
domain proteins.


BL00038B 16.97 7.600e-18 125-146
BL00038A 13.61 1.474e-13 102-118


216 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96-107


217 


PR00381 


KINESIN LIGHT CHAIN SIGNATURE 


PR00381A 9.55 1.321e-10 360-378


222 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 2.358e-26 1166-1203
BL00514G 15.98 9.000e-15 1289-1319
BL00514D 15.35 6.936e-12 1207-1220
BL00514F 11.65 4.288e-10 1253-1268
BL00514H 14.95 8.636e-10 1318-1343


223 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 1.000e-40 93- 
139 BL00325A 24.83 9.333e-24 
61-93 


224 


BL00018 


EF-hand calcium-binding domain 

proteins. 


BL00018 7.41 1.450e-10 231-244


225 


PF01329 


Pterin 4 alpha carbinolamine dehydratase.


PF01329B 18.52 1.692e-18 67-92 


228 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 6.250e-18 1033- 
1065 BL00211B 13.37 8.875e-18 
2045-2077 BL00211A 12.23 
1.900e-09 931-943 


230 


PR00761 


BINDIN PRECURSOR SIGNATURE 


PR00761A 5.81 9.366e-09 275- 
292 


231 


PR00049 


WILMS TUMOUR PROTEIN 

SIGNATURE


PR00049D 0.00 3.500e-10 54-69


232 



BL00412 


Neuromodulin (GAP-43) proteins.


BL00412D 16.54 1.978e-10 109-160
BL00412D 16.54 4.122e-09 133-184


233 


BL01210 


Caveolins proteins. 


BL01210B 13.92 8.129e-09 106- 
156 


236 


BL00939 


Ribosomal protein L1e proteins.


BL00939F 17.27 5.393e-09 861-891


238 


BL01252 


Endogenous opioids neuropeptides 
precursors proteins.


BL01252D 18.25 3.571e-28 205- 
233 BL01252B 19.09 5.034e-27 
J7-67 BL01252C 18.10 1.621e-21
164-190 BL01252A 14.22 7.107e- 

ID 1/1 1A 


239 


BL00302


Eukaryotic initiation factor 5A hypusine 
proteins. 


BL00302 14.81 1.000e-40 25-79


240 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I 


PD02929A 28.27 4.529e-09 235- 
289 


243 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.527e-25 11-50 


244 


BL01270 


Band 7 protein family proteins. 



BL01270C 16.91 6.745e-17 115-144
BL01270B 18.74 6.857e-17 76-115
BL01270E 13.03 6.016e-15 182-211
BL01270D 20.87 9.160e-13 144-182


245 



PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253-308
PF00791B 28.49 1.909e-11 427-482
PF00791B 28.49 2.651e-09 179-234
PF00791B 28.49 3.890e-09 112-167


246 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.500e-13 277-290
PD00066 13.92 9.143e-12 193-206
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234



247 


BL00406 


Actins proteins. 


BL00406D 12.58 6.400e-20 465-520
BL00406B 5.47 4.857e-14 249-304
BL00406E 8.44 1.000e-11 522-572
BL00406C 6.75 5.449e-11 313-368


248 


BL00951 


ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112-161
BL00951A 15.10 7.750e-39 21-57
BL00951D 13.94 6.000e-38 161-196
BL00951B 14.23 3.100e-31 57-88


252 



BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200-227
BL01113A 17.99 4.818e-14 194-221
BL01113A 17.99 7.818e-14 182-209
BL01113A 17.99 1.730e-13 185-212
BL01113A 17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-230
BL01113A 17.99 9.182e-11 179-206
BL01113A 17.99 2.532e-10 176-203
BL01113A 17.99 9.043e-10 218-245
BL01113A 17.99 9.426e-10 209-236



257 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 1.837e-21 466-491 


259 


PR00248 


METABOTROPIC GLUTAMATE 
GPCR SIGNATURE 


PR00248G 12.67 2.688e-09 53-78 


260 



BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 441-452
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426 
BL00678 9.67 5.800e-10 455-466 






WO 01/57190 



PCT/US01/04098



SEQ 
ID 

NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 











262 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 3.400e-10 468-479 

DLUUO/O 7.D( J.OUUC*lU JU0OI7 

BL00678 0 67 8 SflOe-IO IRC 


263 


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BL50002B 15.18 2.200e-10 415-429


264 


BL00049 


Ribosomal protein L14 proteins.


BL00049C 17.38 3.040e-12 94- 
130 


265 


PD01469


GLYCOPROTEIN PROTEIN


PD01469 20.59 2.091e-14 438-470


266 


PD01469 


GLYCOPROTEIN PROTEIN 
rivxsv^ u ivo wjcv. o/\. 


PD01469 20.59 2.091e-14 279-311


267 


BL00567 


Phosphoribulokinase proteins. 


BL00567A 10.66 1.161e-12 36-55 


269 


BL00049 


Ribosomal protein L14 proteins.


BL00049C 17.38 2.688e-28 92-128
BL00049B 18.42 6.806e-24 54-86
BL00049A 13.86 8.333e-19 19-42
BL00049D 13.47 5.765e-12 129-140


272 


BL01115 


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 9.735e-12 14-58 


273 


PR00021 


SMALL PROLINE-RICH PROTEIN
SIGNATURE


PR00021A4.31 1.9Ue-09 819- 
832 


275 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 2.895e-13 124-137
PR00179A 13.78 3.250e-11 36-49
PR00179C 19.02 6.040e-11 154-170




PR00449


TRANSFORMING PROTEIN P21 RAS
SIGNATURE 


PR00449A 13.20 8.364e-17 22-44
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172-195
PR00449B 14.34 5.680e-10 45-62


271 


BL00140


Ubiquitin carboxyl-terminal hydrolases
family 1 cysteine activ.


"Di AAi/iAr^ 00 £/i i nnn« .ha 1^1 
x3LUU14UL> Zz.o4 I.OOUe-40 161- 

. *>AC QT Art 1 >f f\r* 1 1 OA A AC1„ -J A 

79-104 BL00140A 15.96 9.400e- 
28 5-35 BL00140B 12.29 4.649e- 

I / J r"JJ 


278 


PD02712 


ELEMENT TRANSPOSASE FOR
TRANSPO^ON TRAtt^Pn^Am F 


PD02712A 23.03 8.013e-09 47-83 


279 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.474e-09 100-111 


282 


DM00892


3 RETROVIRAL PROTEINASE.


L/MUUoyx^ zJJj /D/e-Xl 504- 

898 


283


BL00048 


Protamine P1 proteins.


RT nnO/lft A 1Q 0 CCA- AO <iC ci 


286 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081A 10.53 1.878e-11 36-54


287 


PR00310 


ANTI-PROLIFERATIVE PROTEIN
BTG1 FAMILY SIGNATURE


PR00310B 10.59 4^3 le- 17 29-59 
jtxxuuj ivu zr.i\j o.o/ye-io oy-j iy 


289 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-36 37-76 


293 


BL00979 


G-protein coupled receptors family 3 
proteins.


BL00979L 20.63 3.800e-12 111- 


295 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-16 195-229


296 


BL01064 


Pyridoxamine 5'-phosphate oxidase
proteins. 


BL01064A 27.84 8.313e-28 77-129
BL01064C 15.22 7.136e-25 202-235


297 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 2.929e-13 37-56
BL00030B 7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-10 128-147
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TD 



298




n Hi /PDH mpthvltrancfAraco fUmiKi 

uuiej iiicuiyiuauoieiase ramny 
proteins. 


25j_.u 1 1 ojt> z i . j i o.ooue-iz I*frJ- 
188 


299 


BL01279 


Protein-L-isoaspartate(D-aspartate) O- 
methyltransferase signa. 


BL01279A 24.27 5.862e-11 57-105


301 


BL00191 


Cytochrome b5 family, heme-binding
domain proteins. 


BL00191K 17.38 4.951e-27 184-228
BL00191J 11.37 6.447e-17 128-150




DM00892


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 3.893e-16 33-67


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 


307 



PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.818e-21 59-81 
PR00245C 7.84 5.154e-20 238- 
254 PR00245D 10.47 4.000e-15 
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306 


309 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119-159
BL00237C 13.19 3.864e-15 251-278
BL00237D 11.23 3.739e-12 312-329


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110-136
BL00380G 11.26 5.800e-16 267-280
BL00380B 14.77 7.000e-14 49-62
BL00380F 9.76 5.886e-13 203-214
BL00380C 15.67 7.387e-13 82-98
BL00380E 12.44 7.000e-11 181-193
BL00380A 10.48 1.000e-09 10-20


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 50-105
BL00227C 25.48 1.000e-40 111-163
BL00227D 18.46 1.000e-40 220-274
BL00227F 21.16 1.000e-40 372-426
BL00227A 24.55 3.250e-39 1-35
BL00227E 24.15 8.500e-34 324-359


327 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 7.362e-21 225-273
BL00232B 32.79 2.588e-17 435-483
BL00232B 32.79 6.301e-15 116-164
BL00232B 32.79 6.769e-13 330-378
BL00232C 10.65 9.341e-12 223-241
BL00232C 10.65 5.696e-11 328-346
BL00232C 10.65 3.942e-10 433-451


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2- 
15 




PR00391


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN SIGNATURE


ddaaioit; 11 gft 1 HOCa Kill 

231 PR00391B 8.39 1.000e-13
83-104 PR00391D 12.21 9.328e-
13 191-207 PR00391A 7.83
5.390e-11 16-36


332 


BL01030 


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125 


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340


PD02711 


SYNTHASE 


PD02711B 14.26 1.973e-20 944-
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SEQ 






PHOSPHORIBOSYLFORMYLGLY. 


968 


343 


BL00223 


Annexins repeat proteins domain 
proteins. 


BL00223C 24.79 1.000e-40 245-
300 BL00223B 28.47 8.714e-38 
168-218 BL00223A 15.59 8.250e- 
27 98-132 BL00223A 15.59 
o.750e-27 26-60 BL00223C 24.79 
9.438e-16 13-68 BL00223C 24.79
2./3je-13 o5-14U £>LUU223A 

15.59 2.253e-11 258-292


346 




STATHMIN FAMILY SIGNATURE



rK0Uj45B 7.12 2.800e-2o 81-1 10 
PR00345E 8.54 7.652e-28 158-
183 PR00345C 4.54 9.100e-28
110-134 PR00345D 10.97 1.964e-
24 134-158 PR00345A 13.46
5.645e-16 52-71


347 


BL00586 


Ribosomal protein L16 proteins. 


BL00586B 17.00 3.215e-15 184- 
221 


348 


PR00388 


3',5'-CYCLIC NUCLEOTIDE CLASS II
PHOSPHODIESTERASE SIGNATURE 


PR00388A 10.45 2.778e-09 86- 
105 




BL00018


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257




"Dl AA<*7© 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 1.947e-09 256-267


358 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 3.278e-09 175- 
195 DM01206B 10.69 6.696e-09 
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177-
197 


361 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219- 
263 


362 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP.


PD01498C 24.90 6.880e-14 219-
263 


365 


BL00178 


Aminoacyl-transfer RNA synthetases
class-I proteins.


BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09

4000 


366 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 1.000e-23 318- 
348 BL00523A 13.36 5.500e-16 

78-90 BL00523C 12.64 9.625e-13 
129-140 BL00523G 9.46 5.500e- 
10 506-516 


369 


BL00107 


Protein kinases ATP-binding region
proteins.




370 


BL00880 


Acyl-CoA-binding protein.


Til nnQCA 1 1 «;o l nnno^i a n< 
oJUUUooU 1 /.32 1.UUUG-4U 


371 


BL00107 


Protein kinases ATP-binding region 
proteins.


BL00107A 18.39 1.000e-23 276- 

1A*7 DT AA1ATO 11*31 1 rQT« 1**> 

342-358 


372 




GLUTELIN SIGNATURE



rKUUZl 1±> O.oo o.oU2e-l 1 326- 
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354


373 


BL00279 


Membrane attack complex components / 
perforin proteins. 


BL00279E 37.11 9.349e-10 749-
797 


375 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.231e-33 10-49 


377 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.563e-28 10-49 


379 


BL00598 


Chromo domain proteins. 


BL00598 14.45 5.781e-16 3-25 
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SEQ 





PR00413


HALOACID

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413D 11.28 8.941e-09 864- 
878 


383 


PR00413 


HALOACID

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413D 11.28 8.941e-09 864- 
878 


387




Flagella transport protein fliP family
proteins. 


BL01060A 15.65 1.535e-09 131- 
174 


388


PR00209


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 6.318e-11 1009-
1028 


389 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837B 11.64 1.000e-10 469- 
483 


391 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 118- 
142 


392 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 691- 
706 


393 


PR00014 



FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60
BL01209 9.31 5.500e-13 92-105 


395 


BL00634 


Ribosomal protein L30 proteins.


BL00634 34.38 4.090e-13 70-121 


396 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013D 26.81 8.000e-26 358-402
BL01013A 25.14 7.231e-21 45-81
BL01013C 9.97 1.000e-13 132-142
BL01013B 11.33 1.000e-11 110-121


397 


BL00930 


Peripherin / rom-1 proteins. 


BL00930E 17.80 1.000e-40 56-92 
BL00930D 9.12 4.632e-37 12-56 
BL00930F 16.91 2.800e-36 92- 
133 


400 


PR00780 


LEUSERPIN 2 SIGNATURE


PR00780B 4.89 4.491e-09 262- 
285 


401 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-11 4-20


403 


BL00381 


Endopeptidase Clp serine proteins.


BL00381C 23.84 1.250e-32 150- 
194 BL00381A 16.48 2.286e-22 
74-111 BL00381B 21.42 8.326e- 
14 78-130 


405 


BL01105 


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49
BL01105B 12.95 1.000e-40 68-108


406 


BL00344 


GATA-type zinc finger domain proteins. 


BL00344 17.99 7.000e-12 814-852 


407 


PR00211 


GLUTELIN SIGNATURE


PR00211B 0.86 9.750e-09 73-94


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 4.321e-09 9-22


410 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 1.000e-28 752- 
789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e- 
18 825-862 BL00762B 16.14 
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262-280
BL00690A 6.87 1.818e-13 230-240


415 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 52-107
BL00227C 25.48 1.000e-40 113-165
BL00227D 18.46 1.000e-40 222-276
BL00227F 21.16 1.000e-40 382-436
BL00227E 24.15 1.750e-34 326-361
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SEQ 








BL00227A 24.55 1.000e-33 1-35 


A 1 A 


Kruuyyz 


i roponm. 


Fr 0095/2 A lD.OV 1.7Ile-09 557- 
592 




BL00541


Nuclear transition protein 1 proteins.


BL00541 8.44 9.875e-09 256-310




BL00541


Nuclear transition protein 1 proteins.


BL00541 8.44 9.875e-09 197-251 


420




SET domain proteins.


PF00856A 26.14 9.074e-13 901-938
PF00856B 16.42 2.397e-12 951-973


421 


BL00678


Trp-Asp (WD) repeat proteins proteins.



BL00678 9.67 8.200e-12 33-44 


423 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.600e-30 130-169 


424 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 1.305e-17 421-
472 


426 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


427 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


428 


BL00478 


LIM domain proteins. 



BL00478B 14.79 3.250e-13 115-130
BL00478B 14.79 9.036e-13 50-65


431 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.875e-12 464-487


432 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.800e-18 316-357
PD00930A 25.62 9.617e-12 125-151
PD00930B 33.72 2.521e-10 214-255


433 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.649e-34 34-73 


434 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.563e-11 56-78


436 


PR00120 


H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-19 705- 
722 


437 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115T 8.45 7.273e-29 1208-1242
BL00115Q 18.08 2.776e-21 953-983
BL00115Y 11.86 8.000e-17 1604-1650
BL00115M 19.19 8.130e-16 731-774
BL00115H 14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-1010
BL00115J 16.71 9.289e-14 591-617
BL00115I 8.33 4.336e-13 535-590
BL00115L 12.25 5.939e-13 662-694
BL00115G 11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-659
BL00115O 16.76 5.805e-10 863-913
BL00115P 11.54 7.538e-10 913-953
BL00115S 18.24 7.968e-10 1010-1052
BL00115U 10.34 4.475e-09 1242-1265


43 o 


PF00628


PHD-finger.


PF00628 15.84 4.536e-10 219-234


440 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.351e-34 10-49 


441 


PR00309 


ARRESTIN SIGNATURE 


PR00309A 9.68 5.250e-24 32-55 
PR00309D 7.09 4.938e-23 290- 
309 PR00309B 7.81 2.800e-21 
69-88 PR00309C8^2 1.621e-19 
165-183 PR00309E 9.82 9.438e-
15 374-389 


442 


BL00600 


Aminotransferases class-III pyridoxal-


BL00600B 19.60 7.324e-14 103- 
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n> 







phosphate attachment si. 


129 BL00600G 12.43 2.125e-12
306-325 BL00600F 8.77 8.105e-
12 271-284 BL00600E 16.43
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-195
BL00349F 11.81 1.000e-40 213-255
BL00349H 15.70 7.387e-36 361-399
BL00349B 10.51 2.227e-34 54-82
BL00349D 11.70 9.100e-34 125-152
BL00349G 19.72 5.781e-30 323-356



445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271-295
BL00154E 20.37 2.620e-15 124-165



448 


DM00215 


PROLINE-RICH PROTEIN 3. 



DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120 


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-160
BL01283D 11.70 6.000e-39 253-286
BL01283B 23.17 6.538e-38 170-212
BL01283C 13.05 7.750e-19 222-236


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453 


PR00162 


RIESKE 2FE-2S SUBUNIT 
SIGNATURE 


PR00162B 12.77 7.429e-17 215-
228 PR00162A 9.35 2.324e-14
193-205 PR00162C 8.10 7.120e-
14 227-24U 


454 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-30 87-126 


456 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 y.333e-lo 1 14y- 
1192 


457 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


Y5"T\ft 1 r\CC in >I1 O , 70'7i, 1>1 1 


459




Immunoglobulins and major
histocompatibility complex proteins. 


DlA3\)jS,y\fj\ zAj.oy i.ozye-14 1j4- 
177 BL00290B 13.17 9.000e-12

X X4-ZJZ 


460 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413F 14.91 7.333e-ll 193- 
214 PR00413E 15.78 5.714e-09 
175-192 


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE)
INHIBITOR FAMILY SIGNATURE


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 






Actinin-type actin-binding domain
proteins. 


330 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE 
SIGNATURE 


PR00153D 11.99 3.250e-15 510- 
523 PR00153C 11.01 4.682e-14 
495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11^7 
1.720e-13 452-465


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557- 
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 


PD00289 9.97 1.000e-14 1482- 
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DOITCVM A 

PRESYNA.


1496 PD00289 9.97 8.650e-11
1122-1136


474 


BL50040 


Elongation factor 1 gamma chain profile. 


BL50040D 17.41 1.000e-40 279-329
BL50040E 18.79 1.000e-40 333-388
BL50040F 18.99 5.320e-40 390-428
BL50040C 22.62 3.739e-38 141-184
BL50040B 13.65 7.000e-30 59-85
BL50040A 12.98 1.450e-14 10-22



475 


BL01144


Ribosomal protein L31e proteins. 


BL01144 25.07 1.000e-40 22-74 


476 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007C 15.60 2.421e-21 589- 
611 PR00007B 14.16 3.500e-21 
544-564 PR00007A 19.33 6.897e- 
20 517-544 PR00007D 9.64 
6.571e-12 623-634 


All 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 5.846e-10 170- 
189 


479 


DM01970 


0kwZK632.12YDR313C 
ENDOSOMAL ID. 


DM01970B 8.60 9.500e-17 967- 
980 


480 


PR00868


DNA-POLYMERASE FAMILY A (POL 
I) SIGNATURE 



PR00868C 13.76 5.688e-17 284-308
PR00868A 16.33 3.186e-13 224-247
PR00868H 12.51 3.388e-13 431-448
PR00868I 10.87 7.938e-11 462-476
PR00868E 13.19 1.608e-10 340-366


481 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.182e-22 53-96 



482 



BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061B 25.79 3.647e-21 188- 
226 


483 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.750e-12 1032- 
1051 


485 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 9.625e-10 760- 
776 PF00023A 16.03 3.571e-09 
715-731 


486 


PD02870 


RECEPTOR INTERLEUKIN- 1 
PRECURSOR. 


PD02870B 18.83 9.262e-20 103- 
136 PD02870D 15.74 9.426e-09 
201-236 


487 


PR00370 


FLAVIN-CONTAINING 
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370G 10.45 3.769e-28 471- 
493 PR00370B 10.91 1.000e-24 
27-46 PR00370C 12.72 4.000e-21 
140-157 PR00370E 11.96 9.229e- 
21 320-339 PR00370D 16.33 
1.750e-20 185-204 PR00370F 
17.75 7.395e-20 375-395 
PR00370A 3. J 5 2.03oe-18 4-20 


489 


PD01675 


GLYCOPROTEIN MAJOR ENVELOPE 
PROBABLE U3. 


PD01675C 19.89 2.330e-10 55-89 


492 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57


493 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57


494 


BL00211 


ABC transporters family proteins.


BL00211A 12.23 5.050e-09 58-70 


495 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822 


497 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.800e-22 214-245
BL00107B 13.31 1.000e-13 281-297
BL00107A 18.39 3.520e-13 583-614
BL00107B 13.31 8.615e-12 652-668


499 


BL00383 


Tyrosine specific protein phosphatases 


BL00383E 10.35 1.000e-14 1902- 
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proteins. 


1913 BL00383D 11.92 3.077e-14 
1862-1875 BL00383A 13.34 
5.500e-14 1730-1745 BL00383C 
10.10 2.000e-13 1785-1796 
BL00383F 15.51 9.069e-12 1940- 
1956 BL00383B 7.61 1.692e-11
1755-1764 


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136- 
150 PR00019A 11.19 1.667e-09 
91-105 PR00019B 11.36 4.600e- 
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367-414
BL00226B 23.86 6.143e-27 195-243
BL00226A 12.77 7.840e-14 96-111
BL00226C 13.23 2.600e-13 309-340
BL00226C 13.23 6.143e-12 266-297
BL00226B 23.86 1.209e-09 146-194


505 


PD02407 


3-BISPHOSPHOGLYCERATE-
INDEPENDENT PHOSPHOGLYCER.


PD02407F 7.61 6.739e-09 916- 
930 


506 


PF00632 


HECT-domain (ubiquitin-transferase). 


PF00632C 20.66 9.830e-19 991-1023
PF00632B 18.45 1.155e-11 940-968


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116 


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 



PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10 
763-778 PR00320C 13.01 6.760e- 
10 567-582 PR00320A 16.74 
7.618e-10 846-861 PR00320A
16.74 3.415e-09 763-778 
PR00320A 16.74 o.2ooe-U9 567- 
582 


511


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins. 


Jt5L0047yC lZ.Ul J .23 lie- lz 1 /U- 

183 



512 


BL50058 


G-protein gamma subunit profile.




513 


BL00524 


Somatomedin B domain proteins. 


BL00524A 9.65 8.925e-14 80-101 


515 


BL00041


Bacterial regulatory proteins, araC family
proteins. 


BL00041 23.99 1.9o4e-19 4y2024 


516 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.500e-13 391-404


517 


BL00415 


Synapsins proteins. 


BL00415E 4.82 9.291e-09 959- 
996 


518 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126- 
145 


519 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 4.750e-09 47-65 


522 


PR00505


D12 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE 
SIGNATURE 


rKUUDUDA 14.13 7.12oe-Uy 304- 

381 


525 


BL00312 


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891- 
920 


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55 


529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131-
150 PR00254A 11.23 4.706e-14
61-78 PR00254C 11.36 4.000e-12
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113-126 PR00254B 12.97 1. 486e- 
11 95-110


531


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 6.870e-16 787-810


532 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.143e-34 447-476
PR00193C 12.60 7.632e-32 216-244
PR00193B 11.69 7.750e-29 167-193
PR00193A 15.41 2.588e-22 111-131
PR00193E 19.47 2.200e-21 501-530


533 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 5.596e-09 348- 
381 


535 


PR00683 


SPECTRIN PLECKSTRIN 
HOMOLOGY DOMAIN SIGNATURE 


PR00683D 15.87 2.452e-10 465- 
484 


536 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.684e-24 164-207 


538 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 2.739e-09 225-
237 


539 


BL00406 


Actins proteins. 


BL00406C 6.75 1.000e-40 157- 
212 BL00406B 5.47 6.143e-37
90-145 BL00406D 12.58 4.600e- 
36 291-346 BL00406E 8.44 
2.200e-33 364-414 BL00406A 
9.95 4.441e-23 7-42 


540 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 9.625e-10 44-59 


541


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 9.625e-10 44-59 


542 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.857e-ll 138- 
154 


544 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and 
similar). 


PF00642 11.59 9.082e-10 838-849


546 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 4.115e-10 104-
115 


547 


BL01226 


Hydroxymethylglutaryl-coenzyme A
synthase proteins. 



BL01226A 13.79 1.000e-40 50-89
BL01226C 13.51 1.000e-40 127-167
BL01226D 11.60 1.000e-40 174-210
BL01226E 13.74 1.000e-40 212-253
BL01226H 17.74 1.000e-40 386-434
BL01226I 25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-321
BL01226B 13.35 1.818e-31 95-127
BL01226F 9.78 8.714e-23 253-271


549 


BL00964 


Syndecans proteins. 


BL00964B 12.05 2.426e-10 1246-1289


551


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e-37 170- 
215 DM01930F 14.16 8.232e-28 
267-303 DM01930B 19.86 
9.163e-10 37-71 


552 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e-09 9-29 


554 


BL00383


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 2.756e-12 436-447


555 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 7.612e-11 122-137
PR00403A 16.82 3.912e-10 107-121
PR00403B 12.19 2.068e-09 76-91


558 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 2.714e-26 76-98
PR00380D 9.93 3.000e-24 275- 
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297 PR00380C 13.18 5.1 54e-20 
226-245 PR00380B 12.64 9.400e-
20 195-213 


1 ^ " * 



BL00518


Zinc finger, C3HC4 type (RING finger),
proteins. 


BL00518 12.23 5.333e-09 522-531


561


PD01795


PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 159-
172 PD01795A 10.27 1.000e-09 
135-144 


562 


PD01795 


PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 110- 
123 PD01795A 10.27 1.000e-09 
86-95 


563 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 1.391e-09 41-54 


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 


CALCIUM-BI. 


PD00301B 5.49 4.115e-09 284-295


569 


PF00850 


Histone deacetylase family. 


PF00850E 8.88 6.553e-21 756-782
PF00850D 14.76 1.519e-16 722-746
PF00850F 15.70 1.118e-11 794-827
PF00850G 22.75 8.375e-11 833-875


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 4.960e-10 137-151 


571 


BL00518 


Zinc finger, C3HC4 type (RING finger),
proteins. 


BL00518 12.23 8.800e-11 44-53


573 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 986- 
1021 


576 


BL00284 


Serpins proteins. 



BL00284C 28.56 5.200e-26 200-242
BL00284A 15.64 4.913e-18 71-95
BL00284B 17.99 7.261e-15 173-194
BL00284D 16.34 5.846e-13 306-333
BL00284E 19.15 7.429e-12 387-412


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.553e-29 15-54 


580


BL50001 


Src homology 2 (SH2) domain proteins 
profile.


BL500ulo 17.40 4.5U0e- 12 1010- 

1 All 


581 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 3.189e-22 608- 
649 PD00930A 25.62 6.806e-17 
505-531 


584 


BL00612 


Osteonectin domain proteins. 


BL00612B 11.35 2.034e-11 93-
lzo 


585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102- 
122 


586 


PF00628 


PHD-finger. 


PF00628 15.84 3.455e-12 235-250 


587


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.063e-10 85-128 


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE


PR00326A 8.75 7.525e-16 227-
z**o JtKuujxol* y. /y o. /oue-i j 
276-292 PR00326D 19.09 6.657e- 
13 293-312 PR00326B 16.74 
9.229e-13 248-267 


589 


BL00422 


Granins proteins.


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 



Synapsins proteins. 


BL00415N 4.29 9.794e-10 295-
339 


591 


BL00128 


"Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-
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132 


596 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 3.136e-09 31-46


597 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547C 17.30 1.667e-19 207-229
DM00547E 13.94 6.200e-18 319-342
DM00547B 11.28 1.000e-17 179-193
DM00547D 11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-726
DM00547A 12.38 4.818e-11 158-170


600 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.882e-27 13-52 


601 


BL00192 


Cytochrome b/b6 heme-ligand proteins. 


BL00192A 11.90 6.400e-09 390-
430 


602 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157 


603 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118- 
157 


606 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09 
323-337 


607 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292- 
306 PR00019A 11.19 5.667e-09 
323-337 


608 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 9.500e-12 168- 
183 PR00320A 16.74 2.853e-10 
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10 
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 


610 


BL00750 


Chaperonins TCP-1 proteins. 


BL00750B 16.17 1.000e-40 70-120
BL00750A 20.07 6.211e-37 26-69
BL00750G 20.12 8.800e-31 431-471
BL00750F 18.40 5.125e-30 370-411
BL00750E 24.59 8.650e-29 295-332
BL00750H 21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-181
BL00750D 16.16 6.318e-14 203-222


613 


BL00766 


Tetrahydrofolate
dehydrogenase/cyclohydrolase proteins.


BL00766B 24.49 1.000e-40 142-190
BL00766E 13.78 1.000e-40 322-359
BL00766C 25.86 5.500e-39 208-256
BL00766D 17.05 4.536e-26 283-313
BL00766A 21.48 6.063e-24 102-132






Adipokinetic hormone family proteins.


BLQQ250 12.28 3.29oe-I0 746-755 


616 


BL00319 


Amyloidogenic glycoprotein extracellular 
domain proteins. 


BL00319C 17.12 9.053e-09 419- 
453 


617 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 4.429e-09 44-63 


618 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 4.429e-09 44-63 


620 


BL00325 


Actin-depolymerizing proteins.


BL00325B 21.66 5.817e-16 77-
123 


622 


BL00972 


Ubiquitin carboxyl-terminal hydrolases


BL00972A 11.93 5.500e-19 213-
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* 






family 2 proteins. 


231 BL00972D 22.55 2.742e-16
501-526 BL00972B 9.45 1.000e-

3.160e-11 370-385 BL00972E

ZU. JJL /.0 1 /e- J U jz004o 


625 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 6.333e-39 6-45 


628 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 7.750e-31 478- 
524 BL00039A 18.44 2.000e-25 
198-237 BL00039C 15.63 1.844e-
15 327-351 BL00039B 19.19
5.636e-14 242-268


630 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 232- 
246 


631 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 290-
304 


633 



BL00785 


5'-nucleotidase proteins.




BL00785C 9.45 3.625e-16 108- 
122 BL00785E 15.85 4.000e-16
279-295 BL00785A 9.73 6.500e- 
14 29-40 BL00785B 10.65 
5.500e-13 72-86 BL00785D 9.89 

a aaaa n iic tvic 


636 


PR00832 


PAXILLIN SIGNATURE 



PR00832E 14.43 9.901e-14 85- 
108 


637 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 6.362e-13 221- 
240 


638 


PF00635 


MSP (Major sperm protein) domain 
proteins. 


PF00635B 15.84 4.900e-11 463-
502 


639 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 1.900e-18 85-99 
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76


641 


PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI. 



PD00066 13.92 4.462e-15 271-284 
PD00066 13.92 4.462e-15 299-312 
PD00066 13.92 2.800e-14 327-340 
PD00066 13.92 2.800e-14 383-396 
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 8.615e-10 607-620 


642 


BL00961 


Ribosomal protein S28e proteins. 


BL00961B 11.24 7.429e-37 67-

inn T5T A O On A <V70<k_0< 

42-66 


643 


BL00585 


Ribosomal protein S5 proteins. 


BL00585A 28.43 1.391e-40 103- 
155 BL00585B 18.78 3.250e-30
193-230 


647 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 9.400e-10 181-192 


648 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876C 6.15 9.229e-09 112-
126 


652 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.941e-27 29-68


653 


BL00047 


Histone H4 proteins. 


BL00047A 13.53 1.000e-40 2-41
BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74- 
104 




PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 4.109e-25 30-69 


655 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 3.483e-17 19-63


657 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.286e-10 31-40 


658 


BL00125 


Serine/threonine specific protein 
phosphatases proteins. 


BL00125B 21.48 1.000e-40 89-
135 BL00125C 19.97 1.000e-40
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83
8.941e-38 47-84


659 


PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI. 



PD00066 13.92 8.200e-16 492-505
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.189e-26 29-68 


661 



BL00795 



Involucrin proteins. 



BL00795C 17.06 7.882e-15 193-
238 BL00795C 17.06 3.797e-13
187-232 BL00795C 17.06 5.014e-
13 188-233 BL00795C 17.06
4.506e-12 196-241 BL00795C
17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-
230 BL00795C 17.06 2.000e-11
198-243 BL00795C 17.06 3.778e-
11 171-216 BL00795C 17.06
6.111e-11 197-242 BL00795C
17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e-
10 195-240 BL00795C 17.06
2.779e-10 184-229 BL00795C
17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-
231 BL00795C 17.06 6.965e-10
190-235 BL00795C 17.06 2.700e-
09 200-245 BL00795C 17.06
5.800e-09 175-220 BL00795C
17.06 6.500e-09 182-227
BL00795C 17.06 6.600e-09 201-
246 BL00795C 17.06 6.600e-09
202-247 BL00795C 17.06 6.600e-
09 208-253


662 


BL00469 


Nucleoside diphosphate kinases proteins. 


BL00469 22.22 1.000e-40 149-204 


663 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.411e-11 331-
385 


664 


BL00601 


Tryptophan pentad repeat proteins (IRF 
family) proteins. 


BL00601A 20.29 5.500e-23 7-46
BL00601B 20.92 3.631e-13 69-98


665 


BL00082 


Extradiol ring-cleavage dioxygenases 
proteins. 


BL00082A 19.07 8.615e-12 49-72


666 


DM01537 


kw SKI2W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 4.073e-37 834-
881 DM01537B 21.63 9.750e-21
1669-1716 DM01537A 15.14
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557


667 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 7.923e-38 820-
867 DM01537B 21.63 9.750e-21
1655-1702 DM01537A 15.14
8.650e-18 684-704 DM01537A
15.14 6.766e-12 1523-1543 


669 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.786e-24 849- 
880 BL00107B 13.31 6.727e-13 
916-932 


670 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.8*9.735e-27 37-89 


671 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.571e-12 432-475


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key'
motif proteins. 


BL00225B 18.06 7.517e-24 1805-
1840 BL00225B 18.06 8.297e-20
1987-2022 BL00225B 18.06
2.575e-19 1896-1931 BL00225B
18.06 8.200e-19 175-210
BL00225B 18.06 8.200e-19 1698-
1733 BL00225B 18.06 4.808e-14
73-108 BL00225B 18.06 4.808e-
14 1596-1631 BL00225B 18.06
5.500e-14 2077-2112 BL00225A
13.82 5.829e-12 2043-2064
BL00225A 13.82 3.127e-09 1759- 
1780 


679 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 4.240e-10 169- 
184 PR00320A 16.74 6.294e-10 
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 1.143e-ll 172- 
215 


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612-
635 PR00852E 8.14 3.769e-27
348-371 PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I
17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-
398 PR00852G 16.19 4.462e-23
468-486 PR00852C 8.81 9.143e- 
23 284-303 


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63


685 



BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 7.500e-20 40-58 
BL00972D 22.55 3.903e-16 300-
325 BL00972B 9.45 1.000e-13
1 1 325-347 


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688 


BL00388 


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54 
BL00388B 31.38 3.864e-33 66-
108 BL00388D 20.71 1.000e-21
153-184 BL00388C 18.79 8.147e- 
16 126-148 


689 


PD02796 


PROTEIN STEROL CARRIER LIPID-
TRAN.


PD02796B 20.92 1.105e-15 347-
394


691 


PD01572 


PHOTOSYSTEM II REACTION
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-31 


692 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.600e-10 488-505 


694 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013A 25.14 9.357e-33 527- 
563 BL01013D 26.81 8.235e-23 
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33 
3.605e-13 592-603 


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 3.571e-13 164-178 
PD00289 9.97 8.650e-11 2147-
2161 PD00289 9.97 2.552e-09 23- 
37 


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282- 
302 


700 


PR00749 


LYSOZYME G SIGNATURE 



PR00749F 13.63 8.636e-13 139- 
156 PR00749H 8.22 3.681e-12
173-194 PR00749B 16.54 1.419e-
11 48-70 PR00749C 7.26 3.060e-
11 72-91 PR00749A 10.33
4.815e-10 24-45 


703 



PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505 
PR00704D 11.05 2.500e-27 132-
158 PR00704E 12.55 5.500e-27
162-186 PR00704F 13.61 1.000e-
22 187-215 PR00704G 13.87
1.237e-21 317-339 PR00704H
13.38 8.138e-21 367-385
PR00704A 14.68 2.125e-19 27-51
PR00704C 11.88 1.257e-17 96-
113 PR00704B 17.94 1.833e-15
72-95


705 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE 


PR00859C 7.06 2.776e-09 94-111


706 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 9.581e-26 369- 
416 BL00226B 23.86 3.250e-24 
203-251 BL00226C 13.23 8.269e- 
21 268-299 BL00226A 12.77 
8.200e-14 103-118 


707 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.440e-10 2-15


708 


BL00361 


Ribosomal protein S10 proteins.


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15 


710 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 8.412e-27 160- 
197 BL00514E 14.28 8.909e-16 
219-236 BL00514H 14.95 1.551e- 
15 317-342 BL00514G 15.98 
7.750e-15 284-314 BL00514D
15.35 4.789e-10 201-214 


711 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 8.714e-12 49-90 


714 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 158- 
202 BL00400D 23.26 2.080e-14
222-259 BL00400A 21.59 1.600e- 
10 27-59 


715 


BL01154 


RNA polymerases L / 13 to 16 Kd
subunits proteins.


BL01154B 24.55 5.500e-36 40-76
BL01154A 18.70 3.000e-22 19-40


716 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.786e-32 10-49 



717 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.206e-14 77- 
102 BL00215A 15.82 8.412e-10
175-200 


719 


BL00309 


Vertebrate galactoside-binding lectin
proteins. 


BL00309C 18.65 2.241e-09 62-87


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266- 
316 BL00687D 26.00 5.333e-28
151-198 BL00687B 17.54 3.647e-
26 39-81 BL00687C 24.13
6.087e-22 96-133 BL00687F 9.55
2.500e-11 352-363


727 


DM01354 


kw TRANSCRIPTASE REVERSE II 
ORF2. 


DM01354N 13.17 1.000e-40 129- 
174 DM01354O 8.73 6.605e-15
180-226 


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 



BL01024A 10.26 1.000e-40 22-69 
BL01024B 8.91 1.000e-40 86-127 
BL01024C 7.80 1.000e-40 146- 
185 BL01024D 13.22 1.000e-40 
185-222 BL01024E 11.96 1.000e-
40 222-266 BL01024F 9.42
1.000e-40 266-317 BL01024G
11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389- 
442 


736 


PF00913 


Trypanosome variant surface 
glycoprotein. 


PF00913D 11.90 7.130e-10 24-51 


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82- 
101 


740 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.600e-09 68-83
PR00320A 16.74 7.366e-09 68-83 


743 


PR00871 


DNA
NUCLEOTIDYLEXOTRANSFERASE
(TDT) SIGNATURE


PR00871G 14.48 8.000e-09 178- 
201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 



BL00215 



Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.200e-15 221- 
246 BL00215A 15.82 7.618e-14
20-45 BL00215A 15.82 8.851e-11
123-148 BL00215B 10.44 9.526e- 
11 69-82 BL00215B 10.44 
7.300e-09 272-285 BL00215B 
10.44 8.500e-09 165-178 


751 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.000e-14 370- 
389 BL50002B 15.18 2.200e-10 
408-422 


752 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 3.089e-12 390- 
440 


753 


PF00622 


Domain in SPla and the RYanodine
Receptor. 


PF00622B 21.00 4.214e-14 47-69


754 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 8.941e-10 66-78 


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70
4.971e-15 344-363 PR00926B
16.07 9.526e-13 210-225
PR00926A 10.41 1.514e-12 197-
211


756


BL01187


Calcium-binding EGF-like domain
proteins pattern proteins.


BL01187A 9.98 2.125e-12 324-
336 BL01187A 9.98 4.789e-11
377-389 BL01187B 12.04 3.057e-
10 439-455


757 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins.


PF00651 15.00 4.429e-10 43-56


758 


PR00055 


HIV TAT DOMAIN SIGNATURE



PR00055A 8.13 8.855e-09 144- 
156 


759 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.304e-11 110-123


760 


PR00448 


NSF ATTACHMENT PROTEIN 
SIGNATURE


PR00448D 12.42 3.455e-27 162- 
186 PR00448A 10.74 1.273e-22 
37-57 PR00448B 16.01 9.379e-21 
100-118 PR00448C 11.46 1.000e-
20 129-147 


765 


BL01042 


Homoserine dehydrogenase proteins.


BL01042A 13.29 5.909e-11 74-95


766 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78


768 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 8.500e-28 112-
149 BL00762B 16.14 3.793e-12
64-78 BL00762A 23.43 6.625e-12
6-43 BL00762C 15.58 4.176e-09
459-472 BL00762D 11.15 9.667e- 
09 210-220 


769 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 1.934e-09 1-20


770 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 



PR00320C 13.01 1.720e-10 262-
277 PR00320A 16.74 2.853e-10
262-277 PR00320C 13.01 4.300e- 
09 96-111 PR00320B 12.19 
5.500e-09 262-277 PR00320A 
16.74 6.268e-09 55-70 


771 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 8.714e-12 87- 
101 PR00019A 11.19 1.000e-10 
90-104 


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155- 
204 


774 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 



DM00547F 23.43 3.942e-28 943- 
990 DM00547E 13.94 9.750e-21 
652-675 DM00547B 11.28
1.818e-18 518-532 DM00547C
17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-
509 DM00547D 11.60 9.200e-11

OZZ-OJO 


776 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 769- 
792 


777


PR00779 



INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 
765 


778


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742-
765


779 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.118e-11 654-
672 PR00205B 11.39 8.588e-11
230-248 PR00205B 11.39 8.527e-
10 551-569 PR00205B 11.39 
4.203e-09 336-354 


783 


BL00625



Regulator of chromosome condensation 
(RCC1) proteins. 


BL00625B 17.69 2.167e-19 193- 
227 BL00625A 16.21 5.500e-17 
199-228 BL00625B 17.69 1.885e-
16 140-174 BL00625B 17.69
2.770e-16 245-279 BL00625A
16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146- 
175 


785 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 


786 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607
PF00084B 9.45 6.400e-09 656-668


787 


BL00826 



MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203-
230


788 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54 
PR00453B 14.65 8.568e-10 75-90


789 


PR00102 


ORNITHINE
CARBAMOYLTRANSFERASE
SIGNATURE


PR00102B 14.82 5.418e-09 963- 
977 


790 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030B 7.03 5.500e-11 199-
209


791 


BL00415 



Synapsins proteins. 


BL00415N 4.29 9.519e-10 393- 
437 BL00415N 4.29 2.117e-09
103-147 BL00415N 4.29 3.628e-
09 97-141 BL00415N 4.29
5.664e-09 387-431 


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.091e-36 105-144 


799 


PF00731 


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337- 
380 PF00731B 19.47 7.429e-28
299-336 PF00731A 19.32 6.333e-
24 268-297 


804 


BL00170 


Cyclophilin-type peptidyl-prolyl cis-trans
isomerase signatur. 


BL00170B 20.97 8.071e-09 297- 
337


805 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 378-389 
BL00678 9.67 5.800e-10 418-429 
BL00678 9.67 8.800e-10 295-306


806 


PD01719


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 7.571e-14 290- 
318 


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE 


PR00320B 12.19 9.100e-09 451-
466


809 


BL00107 


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 4.462e-12 564- 
595


810


PR00453


VON WILLEBRAND FACTOR TYPE
A DOMAIN SIGNATURE


PR00453A 12.79 1.310e-14 36-54 
PR00453B 14.65 8.568e-10 75-90


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55


815 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18
179-208


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 9.571e-11 115-
135


819 


BL00126 


3'5'-cyclic nucleotide phosphodiesterases
proteins. 


BL00126C 22.07 7.857e-24 528- 
569 BL00126E 35.22 3.714e-15
669-724 BL00126D 25.50 1.173e-
14 584-623 BL00126B 15.20
1.000e-12 502-514 BL00126A
27.56 3.361e-09 461-498


820 


PR00511 


TEKTIN SIGNATURE 


PR00511B 12.25 8.826e-22 174- 
195 PR00511A 13.59 7.723e-11
155-172 


821 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 2.800e-15 13-36 


822 


PF00780


Domain found in NIK1-like kinases,
mouse citron and yeast ROM. 


PF007801 14.69 4.825e-09 231- 
261 


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-11 144-
163 


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-11 545-
586 


829 



PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e-
30 235-261 PD02448F 14.22
9.654e-25 279-303 PD02448D
11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 305-
318




BL00720


Guanine-nucleotide dissociation 
stimulators CDC25 family sign. 


BL00720B 16.57 4.500e-23 483- 
507 


831


BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.625e-21 143- 
174 BL00107B 13.31 4.214e-10
213-229 


832 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.787e-11 32-57


833 


PR00497 


NEUTROPHIL CYTOSOL FACTOR 
P40 SIGNATURE 


PR00497A 6.92 4.375e-09 41-59 


834 


BL00229 


Tau and MAP proteins tubulin-binding 
domain proteins. 


BL00229A 23.57 9.565e-10 99- 
138 


835 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053- 
1083 


836 


BL00795 


Involucrin proteins.


BL00795B 12.41 7.931e-09 405-
445 


837 


PR00020 


MAM DOMAIN SIGNATURE 



PR00020A 18.17 1.000e-17 34-53 
PR00020B 15.52 5.846e-16 68-85 
PR00020D 12.70 2.543e-15 147- 
162 PR00020C 13.66 3.483e-13 
95-107 PR00020E 8.64 6.586e-13
165-179


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499-
1515 




PF00850


Histone deacetylase family.


PF00850C 14.55 9.542e-09 1352-
1369


840 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139-
149 PF00023B 14.20 5.500e-09
40-50


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85
BL01194C 12.35 9.250e-40 103-
138 BL01194A 18.70 7.632e-38






WO 01/57190 PCT/US01/04098 



2-37 BL01194D 19.02 2.658e-36
139-178


843 


BL00610 


Sodium:neurotransmitter symporter
family proteins. 


BL00610A 17.73 1.000e-40 40-90 
BL00610B 23.65 1.000e-40 104- 
154 BL00610C 12.94 1.000e-40 
206-258 BL00610E 20.34 1.000e-
40 355-398 BL00610F 29.02
1.000e-40 454-509 BL00610D
20.97 6.063e-35 272-325 
BL00610G 12.89 8.588e-13 514- 
537 


845 


BL00143 


Insulinase family, zinc-binding region 
proteins. 


BL00143A 20.91 4.300e-20 94-
121 BL00143C 14.16 5.500e-13 
245-258 BL00143B 14.41 9.053e- 
10 141-156 


846


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 



848 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 129- 
167 BL00824D 14.04 6.192e-39 
167-202 BL00824B 9.21 2.080e- 
21 96-116 BL00824E 12.49 
3.333e-19 210-226 BL00824A 
13.78 8.650e-14 19-34 




PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.000e-40 12-51 




PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.316e-24 10-49 


852 


BL01272 


Glucokinase regulatory protein family 
proteins. 


BL01272B 19.61 6.870e-30 136- 
171 BL01272C 11.68 3.314e-25 
249-274 BL01272A 6.49 1.231e-
18 99-117 


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.341e-20 65- 
106 


854 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA. 


PD00289 9.97 6.850e-11 140-154


858 


PR00450 



RECOVERIN FAMILY SIGNATURE 



PR00450C 12.22 3.250e-25 68-90
PR00450B 11.76 8.125e-23 22-42
PR00450D 16.58 8.920e-22 92-
112 PR00450E 12.14 1.581e-19
114-133 PR00450G 15.33 5.500e-
19 166-187 PR00450F 12.30
4.375e-15 140-156 PR00450A
13.58 1.857e-14 8-23


860 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 74-117


866 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867 


BL01078 


Molybdenum cofactor biosynthesis 
proteins. 


BL01078B 14.20 1.621e-20 408-

/f70 X2T AYATO A m 1£ 1 AAA— 14 

rJi-rUIU/oA lO.lo 2.000e-l3 

366-379 BL01078D 5.99 3.455e-
11 566-576 BL01078C 10.52
3.793e-11 501-513


868 


BL01177


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 462-
489 BL01177C 17.39 5.333e-19 
416-435 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 441-459 


869 


BL01177


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415-
442 BL01177C 17.39 5.333e-19
369-388 BL01177B 13.61 7.840e-
16 122-138 BL01177D 17.50
1.900e-15 394-412


871


BL50007


Phosphatidylinositol-specific
phospholipase X-box domain proteins
prof.


BL50007A 19.61 1.000e-40 322-
368 BL50007D 19.54 1.000e-40 
589-631 BL50007B 20.90 6.700e- 
36 383-421 BL50007E 25.63 
9.053e-33 748-785 BL50007C 
8.97 5.200e-19 452-469


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 3.250e-17 90-
115 




PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 4.250e-09 370-
386 


877 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 5.500e-13 1343- 
1366


878 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.525e-09 52-85


881 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807E 10.90 4.702e-09 358- 
407


882 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47 


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE 


PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24
134-154 PR00372E 12.62 2.125e-
23 360-380 PR00372C 7.90
3.025e-22 289-309 PR00372F
13.09 6.333e-21 395-414
PR00372D 10.22 1.000e-19 329- 
348 


887 


BL00301 


GTP-binding elongation factors proteins.


BL00301B 20.09 2.800e-24 103- 
135 BL00301A 12.41 4.316e-13
21-33


888 


BL00518 


Zinc finger, C3HC4 type (RING finger),
proteins. 


BL00518 12.23 1.667e-09 30-39 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 113- 
123


892 


BL01022 


PTR2 family proton/oligopeptide
symporters proteins. 


BL01022B 22.19 6.016e-14 72- 
118 BL01022E 23.51 1.173e-12
472-508 BL01022A 11.58 9.135e-
12 42-61 BL01022D 9.42 3.455e-
11 199-212


893 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360-
383 


894 


PD02407 



3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360-
383


895 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237B 13.50 9.100e-14 116- 
138 PR00237F 13.57 1.360e-13
312-337 PR00237G 19.63 9.069e-
13 353-380 PR00237E 13.03
7.120e-12 243-267 PR00237D
8.94 4.150e-11 194-216
PR00237A 11.48 4.375e-11 83-
108 


896 


BL00129 


Glycosyl hydrolases family 31 proteins. 


BL00129D 16.76 8.258e-26 634-
678 BL00129A 26.21 1.720e-25
384-430 BL00129E 22.60 4.857e-
23 698-734 BL00129C 15.12
1.750e-22 596-624 BL00129B
19.19 5.891e-18 495-522
BL00129F 26.19 7.545e-15 814-


897 


BL00598 


Chromo domain proteins.


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101 


INHIBITOR HEAVY CHAIN 

1 T A VTVTT'T T~W T 

CHANNEL IN. 


PD01101B 21.53 1.000e-40 274- 
327 PD01101D 24.45 1.000e-40 
457-512 PD01101A 18.25 6.268e-
23 83-117 PD01101C 12.69 
1.237e-16 366-386 PD01101E 
6.73 7.750e-12 566-576 


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KD 
REGULATORY SUBUNIT 
SIGNATURE 


PR00600A 11.61 5.979e-09 31-52 


901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 8.116e-31 24-63 


903 


BL01115


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 1.509e-11 21-65


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539- 
572 DM00215 19.43 4.750e-12 
549-582 DM00215 19.43 9.824e- 
11 551-584 DM00215 19.43 
2.929e-10 548-581 DM00215
19.43 4.054e-10 550-583 
DM00215 19.43 5.339e-10 552- 
585 DM00215 19.43 7.107e-10
544-577 


907 


PR00988 


URIDINE KINASE SIGNATURE


PR009ooA 0.35J O.Z /oe-lZ 314- 
332 


908


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 5.950e-17 1125-
1156


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149 


910 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


911 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-11 243-
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435-
459 PR00962G 15.71 4.086e-26
593-618 PR00962B 11.98 9.122e-
26 296-319 PR00962A 13.28
6.143e-22 15-34 PR00962C 8.00
4.000e-21 348-369 PR00962F
12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-
643 PR00962I 11.68 9.786e-20
692-712 PR00962E 8.81 2.915e-
18 515-534


915 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365-
389 PR00962G 15.71 4.086e-26
523-548 PR00962A 13.28 6.143e-
22 15-34 PR00962C 8.00 4.000e-
21 278-299 PR00962F 12.39
9.769e-21 482-502 PR00962H
13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-
642 PR00962E 8.81 2.915e-18
445-464


916 


BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


BL00134A 11.96 5.886e-14 90- 
107 


917 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.393e-13 211-
226 BL00478B 14.79 6.712e-10 
271-286 


918 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.729e-09 973-
988 


922 


BL00150 


Acylphosphatase proteins. 


BL00150 25.33 1.000e-40 37-84


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 8.063e-09 79- 
113 


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280- 
331 BL00072E 24.12 8.200e-24 
368-411 BL00072C 25.30 7.873e-
20 226-267 BL00072B 9.48
6.049e-12 183-196 


927 


BL00237 


G-protein coupled receptors proteins.


BL00237C 13.19 1.692e-13 229- 
256 BL00237A 27.68 6.657e-13 
90-130 BL00237D 11.23 9.571e- 
13 290-307 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-47
BL01033B 13.81 1.000e-15 93-
105 


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203- 
253 


932 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353- 
397 BL00415N 4.29 2.117e-09
63-107 BL00415N 4.29 3.628e-09 
57-101 BL00415N 4.29 5.664e-09 
347-391 


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 
30 223-249 PD02448F 14.22 
9.654e-25 267-291 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 293- 
306 


934 


DM00191 


w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN. 


DM00191D 13.94 9.083e-10 136- 
175 


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67-
111 


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865- 
895 


937 


PR00762 


CHLORIDE CHANNEL SIGNATURE 


PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.29 1.000e-21
268-288 PR00762E 12.07 3.250e-
20 520-537 PR00762D 11.29
1.000e-19 470-491 PR00762F
15.12 1.429e-19 538-558
PR00762B 12.12 1.818e-18 214-
234 PR00762G 14.13 3.455e-17 
577-592 


938 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.500e-25 291-334 


939 


DM01111 


4 kw PHOSPHATASE
TRANSFORMING 61 K PDF1.


DM01111E 17.28 1.568e-10 248-
297 DM01111E 17.28 5.168e-10
659-708 DM01111D 16.76
5.263e-09 279-325 DM01111M
10.67 8.674e-09 911-935


CkA A 


BL00107


Protein kinases ATP-binding region
proteins. 


BL00107B 13.31 1.000e-14 293-
309 BL00107A 18.39 6.760e-13 
229-260 


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543-
597 


943 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 3.500e-35 8-47 


945 


BL00989 


Clathrin adaptor complexes small chain
proteins. 


BL00989B 26.51 1.000e-40 66- 
117 BL00989A 11.66 1.000e-13 
5-19 


946 


PR00178 


FATTY ACID-BINDING PROTEIN
SIGNATURE 


PR00178D 13.52 9.571e-09 450- 
469 


947 


BL00178 


Aminoacyl-transfer RNA synthetases
class-I proteins. 


BL00178B 7.11 4.857e-09 713-
724 


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180- 
230 


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-11 26-49
PR00926F 17.75 6.348e-09 134- 
157 


955 


PF00109 


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357 


957 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069A 16.01 8.826e-24 26-51 
PR00069B 11.33 1.514e-17 86- 
105 PR00069C 16.03 8.816e-14 
155-173 


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 
642 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


963 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


964 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 53-96


965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 581-
616


966 


PR00515 


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE 


PR00515D 7.91 5.741e-09 13-33 


967 


BL00579 


Ribosomal protein L29 proteins. 


BL00579B 21.99 5.065e-21 164- 
194 


970 


BL00504 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 


BL00504C 18.68 2.227e-24 34-59
BL00504D 10.43 7.261e-21 75-93 


973 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-09 249- 
271 


974


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99- 
139 


976 


BL00031


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 7.158e-33 60-93 
BL00031B 22.25 5.500e-28 94- 
126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 196-209 
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate--tetrahydrofolate ligase proteins.


BL00721B 13.21 1.000e-40 346- 
401 BL00721D 13.90 1.000e-40 
538-592 BL00721E 13.46 1.000e-
40 597-646 BL00721I 18.79
2.500e-40 814-860 BL00721H
21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-
321 BL00721C 16.92 4.000e-30
498-535 BL00721F 15.96 8.232e-
27 660-702 BL00721G 7.97
3.017e-10 721-734


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 180- 
201 


982 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 59-95 
BL00869E 13.12 9.129e-18 120- 
157 BL00869J 15.60 6.032e-17 
270-310 BL00869H 11.08 1.840e- 

2.543e-16 192-214 BL00869F 
12.77 7.031e-14 157-192
BL00869I 12.92 3.274e-12 242-
270 BL00869D 14.02 5.282e-10
95-124 BL00869B 15.55 9.382e- 
10 31-61 


983 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196F 13.89 2.125e-09 92-108


984 


BL00485 


Adenosine and AMP deaminase proteins. 


BL00485D 30.82 2.427e-10 154- 
209 



* Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid
sequence



TABLE 4



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 




Thrombospondin type 1 domain 


0.002 


22.1 


7 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


6.7e-08 


27.3 


9 


PWWP 


PWWP domain 


8.1e-16 


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20 


81.3 


14 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.7e-42 


153.9 


15 


E1-E2_ATPase


E1-E2 ATPase 


6.3e-124 


412.2 


16 


trypsin 


Trypsin 


1.2e-87


278.6 


17 


Ig 


Immunoglobulin domain 


7.6e-12 


43.2 


18 


lectin c 


Lectin C-type domain 


0.0003 


21.2 


20 


Alpha L fucos 


Alpha-L-fucosidase 


1.2e-217 


7^6.5


22 


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


25 


ank 


Ank repeat 


5.5e-14 


59.9 


27 


pkinase 


Eukaryotic protein kinase domain 


1.5e-100 


347.4 


28 


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07 


38.8 


33 


rrm 


RNA recognition motif. 


1.1e-17


72.2 


34 


rrm 


RNA recognition motif. 


1.1e-17


72.2 


36 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


3e-36 


117.3 


37 


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36 


133.9 


40 


alk_phosphatase 


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2


Zinc finger, C2H2 type 


8.6e-103 


354.9 


45 


sugar_tr 


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_ter 
m 


Cadherin cytoplasmic region 


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding 
domain 


5.2e-18 


73.3 


58 


inositol_P 


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60 


Kunitz_BPTI 


Kunitz/Bovine pancreatic trypsin 
inhibito 


3.7e-47 


148.6 


62 


DAD 


DAD family 


2.5e-74 


260.3


63 


MOZ SAS 


MOZ/SAS family


5.9e-133 


455.1 


64 


MOZ SAS 


MOZ/SAS family


1.7e-123 


423.6 


65 


ras 


Ras family 


9.3e-89 


308.3 


67 


Ham1p_like


Ham1 family


3.7e-49 


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


5.2e-39 


126.1 


70 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41


1.2e-110 


381.0 


72 


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81 


K tetra 


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


84 


AAA 


ATPases associated with various 
cellular act 


1.3e-77 


271.3 


85 


homeobox 


Homeobox domain 


1.4e-28 


108.3 


87 


TGF-beta 


Transforming growth factor beta like 


6.7e-68 


210.2 


91 


mito_carr


Mitochondrial carrier proteins 


4.6e-57 


198.5 


95 


adenylatekinase 


Adenylate kinase 


1.1e-15


60.0 


96 




Immunoglobulin domain 


4.1e-20 


69.8 


99 


CNH 


CNH domain 


3.4e-120 


412.7 


100 


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-47 


170.8 


102 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-89 


309.4 


103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin c 


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin c 


Lectin C-type domain


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9


I 1Z 




Hsp20/alpha crystallin family


2.6e-20 


11.1 


1 1 s 


PP 


mongaiion iacior io 


3.8e-63 


221.1 


116 


sugar_tr


Sugar (and other) transporter 


4e-63 


223.1 


118


catalase


Catalase


0


1158.9


1 1 o 




Ubiquitin carboxyl-terminal
hydrolase, famil


1e-10


24.4


122 


metalthio 


Metallothionein 


2.8e-25 


97.4 


125 


adh_short


short chain dehydrogenase


1.6e-45


164.6 


126 


KRAB 


KRAB box 


7.9e-25 


95.9 


127 


G-alpha 


G-protein alpha subunit


1e-249


843.0 


128 


mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


131 


EF1BD 


EF-1 guanine nucleotide exchange 
domain 


4.9e-53 


189.6 


132 


GYF 


GYF domain 


4.9e-28 


106.6 


133 


GYF 


GYF domain 


4.9e-28 


106.6 


134 


lipocalin


Lipocalin / cytosolic fatty-acid 
binding pr 


2.1e-33 


119.1 


135 


pkinase 


Eukaryotic protein kinase domain 


3.3e-86 


299.8 


136 


ank 


Ank repeat 


2^e-29 


111.1 


137 


IL8 


Small cytokines 
(intecrine/chemokine), inter


3.1e-18 


65.2 


139 


pyridoxal_deC


Pyridoxal-dependent decarboxylase 
conse 


0.00011 


19.0 


140 


cadherin 


Cadherin domain 


1.3e-88 


307.8 


142 


efhand 


EF hand 


5.7e-33 


123.0 


143


Acyltransferase 


Acyltransferase 


2e-29 


111.2 


146 


cytochrome_c


Cytochrome c 


1.7e-33 


124.7 


147 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.3 


148 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


1.7e-09 


45.0 


149 


aldo ket red 


Aldo/keto reductase family 


7.4e-189 


640.8 


150 


homeobox 


Homeobox domain 


3.2e-08 


38.7 


151 


PseudoU synth 
1 


tRNA pseudouridine synthase


4.7e-57


203.0 


152 


abhydrolase


alpha/beta hydrolase fold 


1.7e-31 


118.0 


153 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


1.1e-09


45.6 


156 


PHD 


PHD-finger 


7.6e-15 


62.8 


157 


fn3


Fibronectin type III domain


0.015 


21.9 


158 


homeobox 


Homeobox domain 


2.7e-27 


104.1 


160 


PWI 


PWI domain 


3.9e-24


93.6 


162 


DnaJ 


DnaJ domain 


2e-06 


34.8 


164 


Cbl_N


CBL proto-oncogene N-terminal
domain 


8e-117 


401.5 


166


metalthio


Metallothionein


3.1e-26 


100.6 


167 


LRR


Leucine Rich Repeat


0.00069


26.3 


169 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term 


5.3e-180 


611.4 


170 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


5.3e-180 


611.4 


171 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


1e-149


510.8 


173 


homeobox 


Homeobox domain 


1.5e-29 


111.6 


174 


FYVE 


FYVE zinc finger 


7.4e-28 


103.8 


175 


GRIP 


GRIP domain 


3.9e-08 


40.5 


182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71 


250.0 


185 


CAP GLY 


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain 


2.2e-50 


180.8 


187 


TBC 


TBC domain 


2.2e-50 


180.8 






WO 01/57190 



PCT/US01/04098 





188 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


4e-l ^ 




189 


Kelch 


Kelch motif


•J X W 




190 


Tropomyosin


Tropomyosins


^ 8^-171 

J . OC" 1 / X 


4 


192 


Rieske


Rieske [2Fe-2S] domain


U.UvlO 


lO.J 


199 


ig


Immunoglobulin domain


S Op- 10 

J.7C* XJr 


nVi 1 


9fi9 




f?^* W—i iVp Hnmain 
"IJJvC UUlUoIII 


a zip c>i 


TOO C 




trefoil


Trefoil (P-type) domain


1a 94 

ie-z*t 


73. J 


ZVJH 


i 1 xjLx 


'1 Hnmain 

idl uoniain 


fi ^a in 


i in o 


zuj 


efhand


EF hand


u.uoyo 


ZZ.O 


206 


ISK_Channel 


Slow voltage-gated potassium 


0.0031 


8.1 


ZU / 


trefoil


Trefoil (P-type) domain


z.ye-4o 


1 


209 


Ribosomal_S13


Ribosomal protein S13/S18 


1.2e-78 


274.7 


210


hemopexin


Hemopexin 


1.3e-6z 


221.5 






irjc aomam 


2.5e-4o 


174.0 


215 


Basic 


Myogenic Basic domain 


4.3e-50 


179.8 


zio 


Ribosomal_L24


KOW motif


8.2e-23


89.2


222 


fn3


Fibronectin type III domain


7.3e-141 


481.4 


223 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


9.3e-47 


168.8 


224 


efhand 


EF hand 


6.1e-06 


33^ 


225 


Pterin_4a 


Pterin 4 alpha carbinolamine
dehydratase 


9.3e-42 


152.1 


228 


ABC tran 


ABC transporter 


4.1e-110 


379.2 


234 


E1_DerP2_DerF
2


E1 family


3.7e-90 


312.9 


235 


E1_DerP2_DerF
2


E1 family


1.6e-48 


174.6 


ZJ 1 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


1.7e-25


98.1


ZJo 


\jp iods_jrieur ope 
P 


Vertebrate endogenous opioids
neurope 


i o i co 

i.Se-159 


543.2 


ISO 

ZJV 


eIF-5a


Eukaryotic initiation factor 5A
hypusine


5.9e-104


358.8


240 


Amino oxidase 


Flavin containing amine oxidase 


2.5e-11


37.8 


741 
Z4J 


zf-C2H2


Zinc finger, C2H2 type


o i » on 

z.ie-yy 




7>M 
Z44 


Band_7


SPFH domain / Band 7 family


z.Je-Di 


1 OA *T 

190./ 


74 < 
Z4D 


ank


Ank repeat


i.oe-oo 


*20T C 


74A 


zf-C2H2


Zinc finger, C2H2 type


o,/e-4y 


1 


Z*W 


actin


Actin


z.je-*tz 


1 4 U.J 


948 


xz>x\. luxixcii xoccp 
t 


T?R IttTViAn TMV\t^in rptom !nrt nkrnnfnr 

xz>xv ltuxicii pruictu rciauxuig rcvepior 

• 


9 4a- 1 <i ^ 1 


jzy.j 


9S0 


PMP99 PlntiHin 


PA4P.99/PVTP/MP9n/r > 1 on/4 in fnmiU/ 
j^xYxi *4^/x^xVijr/ JVxx Zv/^lBlluul Lamiiy 




i4n o 


959 




Collagen triple helix repeat (20


T 4a 1 ^ 


jo.O 


955 




f^9 Hnmain 


rt ft^9 


7 ft 


957 


CAP_GLY


CAP-Gly domain


i .*te*zu 


fil ft 


9^n 

zou 




WD domain, G-beta repeat


O Oa £7 

y.ye-oz 


71 ft ^ 
Zlo.3 


9#\1 

ZOI 




WD domain, G-beta repeat


O Oo iC7 


71 ft ^ 
Zl o.D 


262 


WD40


WD domain, G-beta repeat




218 5 


263 


cofilin_ADF 


Cofilin/tropomyosin-type actin-
binding pr 


7.8e-21 


82.6 


264 


Ribosomal_L14


Ribosomal protein L14p/L23e 


9.2e-10 


40.6 


265 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


266 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


267 


ABC tran 


ABC transporter 


9.5e-39 


142.2 


269 


Ribosomal_L14 


Ribosomal protein L14p/L23e 


6.2e-62 


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8


971 


rrm 


RNA recognition motif.


U.U /f 


1A fs 


975 


lipocalin


Lipocalin / cytosolic fatty-acid
binding pr


9 g»Jl 1 


146 A 


276 


ras


Ras family


1.1e-67


238.3


277 


UPH 


f JhifTuirin cai+iowl-terminal 
uuiijuiuu wcu in/A yi ivi uuiiai 

Aljr \Mm\ \J V| 1 111! 4 




503 0 i 

JwJ.7 




START


START domain




AA 1 


970 




WD domain, G-beta repeat


1 Rft-97 


104 7 


282 


G-patch 


G-patch domain 


7.8e-22 


86.0 


987 
ZD / 


Anti_proliferat


BTG1 family


1 9» lOI 


jJl.U 






KRAB box


7.1e-21


d S i 
oZ.o 


293 


7tm_3


7 transmembrane receptor 


3.3e-73 


256.6 


295 


SET 


SET domain 


5e-30 


113.2


296 


Pyridox_oxidase


Pyridoxamine 5'-phosphate oxidase


1.3e-7o 


2oo.O 


297 


rrm


RNA recognition motif. 


5.4e-45 


162.9 


298 


Ubiemethyltran 


ubiE/COQ5 methyltransferase family


6.3e-05 


-VO.3 


299 


Ubiemethyltran 


ubiE/COQ5 methyltransferase family


0.0024 


-1 lo.l 


301 


Cyt_reductase 


FAD/NAD-binding Cytochrome
reductase


7.7e-ol 




302 


G-patch


G-patch domain


3.1e-14


*<A T 

00./ 


307 


7tm_1


7 transmembrane receptor (rhodopsin
family)


7.7e-43 


13o.x 


30o 


DTI 


rn aomam 


O OAl ^ 


17 R i 


310


7tm_1


7 transmembrane receptor (rhodopsin
family)


1 .4e-o4 


Z / l/.o 


i IT 1 
Jl I 


Rhodanese


Rhodanese-like domain




99#£ 7 i 
ZZO. / 


J 1Z 


tubulin 


Tubulin/FtsZ family


•♦.ye-zoo 


70J.0 


i 14 

J14 


OT T~D "C/1 


CTTPI74 familu 

oUxsjth iamuy 


1 0/» 1QO 


67/5 6 i 




IMo 


unpivinucivsarnx? iamuy 




9fl7 S 


JZ 1 


cadherin


Cadherin domain




^16 0 


190 




^VT A t ^ #1 om 

INAv Qomalu 


9 1 o-9 R 


107 R 


no 


IP_trans


Phosphatidylinositol transfer protein


O.JC"70 


T^R 7 


119 


TT7TTQ 




O.OC-UJ 




337 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1/1 A 

J4U 


A TD C 


a. i k synmase reiaxea proiem 


4eoz 


190 9 


1/11 


annexin


Annexin


A £a on 

4.oe-5U 


970 A 


346 


Stathmin 


Stathmin family 


1.8e-90


314.0 


347


Ribosomal_L16


Ribosomal protein L16


4.oe-u9 


ia a 


34o 


lactamase_B


Metallo-beta-lactamase superfamily


0.012 


-O.U 


i«;i 


efhand 


EF hand


"> <*» 1A 


61 0 


jjj 


lectin c 


Lectin C-type domain


l ,3e-u«> 


19 1 


15A 


WD40


WD domain, G-beta repeat


9 9*»-l R 


74 5 


iao 


lipocalin


Lipocalin / cytosolic fatty-acid
binding pr


O.JC- 11/ 








Acetyltransferase (GNAT) family


0 0019 


24 9 




♦"D XT A _M/n f 1 


tU^JA cvnthptncpc flacc I H I T^/f nnrl 
LXvi t ^tl sjriiuicuuva Wood A JU, i»l ulU 


A 6p-1SS 


628 2 




Cl 1 1 fain BA 


Qiilfatncp 


6 li»-998 


770 6 


36R 


START


START domain


j.OC" 1 1 


50 5 


369


pkinase


Eukaryotic protein kinase domain


2.4e-10


41.3 


370 


ACBP 


Acyl CoA binding protein 


4.4e-56 


199.7 


371 


pkinase


Eukaryotic protein kinase domain 


1.6e-94


327.5 


373 


EGF 


EGF-like domain 


2.6e-12 


54.3 


375 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-64 


225.4 


377 


KRAB 


KRAB box


3.7e-27 


103.7 


379 


SET 


SET domain 


7.3e-61


215.6 


380 


Glyco_transf_8


Glycosyl transferase family 8 


0.0028 


-40.1 


381 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-06 


33.7 


383 


Glyco_transf_8


Glycosyl transferase family 8


0.0028 


-40.1


384 


RasGEF 


RasGEF domain 


8.1e-43 


155.7


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos_transf_2


Glycosyl transferases


1.3e-15


65.3


390 


Na_Ca_Ex


Sodium/calcium exchanger protein


3.9e-105


362.7


391 


fn3


Fibronectin type III domain


4.1e-102


352.6


392 


fn3


Fibronectin type III domain


3.4e-45


163.6


393 




Fibronectin type III domain


3.4e-45 


163.6 


394 




T rkix/— HpTiQitr/ liTiOTimtpiTi rprpntnr 

LjKMYV "UvlioJ IjT 1XLJ 1/1/1 U lVlll J vV»Cp IVJJ 

reneat 

X WUvUI 


7 1 P-4.Q 


1 fJ.o 


395


Ribosomal_L30


Ribosomal protein L30p/L7e




1 ft 0 
1 u.v 


396 


Oxysterol_BP


Oxysterol-binding protein


1 5p-04 


177 S 


397 


RDS_ROM1


Peripherin/rom-1




171 0 


399 


lactamase_B


Metallo-beta-lactamase superfamily




141 ft 






r-OUA U Villi Oil 1. 


0 0/lfl"> 








i^ip protease? 


4 8o_vC/l 

*Koe-o*t 


■77ft 7 


405 


Ae 


ISJ.LK/oULLlal pjv/lClll VjJ J t\K> 


oe-/ / 


7ft0 0 


406 


LIM 


LIM domain containing proteins


0.00021


20.7


410 


tRNA-synt_1c


tRNA synthetases class I (E and


1.1e-236


799.8


411


NTP_transf_2


Nucleotidyltransferase domain 


3.9e-16 


67.0 


412 








17 7 I 
1 / «a» 


414 




"Oonuirn r»"f itnlninwn fimptinn T*>TTP04 
x.yi/1 iidi 1 1 i/i m miii/ wit iiuii^ui/ii i-/ujry*T 


o ooni i 


7ft O 

aO.7 


415 
ti j 


fiiHi ii in 


1 UL/U1X11/17 LO^f ICUXIlXjr 


4 5ft-780 


071 7 


420 


SFT ^ 


tJJUf 1 U 1/111 Cllii 




701 ^ 


421 




\X/ 1 1 Hntnain n»hpta rpnp^t 
w j_/ ui/iiiaixij \j Uvia icucdi 




100 ft 

1 1/7.1/ 


471 


*>1 *v£I1a 


7inr finopr tvnp 
Zjiiiv luigci | w<biiA ly ji/v 




1 *r*T«7 


424 
tit 


T\lf inticp 
pnuiooC 


IZiUJVUl jr L/llU |/1UICX11 K.lllaoC (lUJlldlU 


8 Qp-7^ 


7ft1 8 




1 TM 


J-illVl UUmaul Cl/IllaullDg pil/lClflS 


1 ftp.^4 

i .oe- j«* 


17ft 7 


411 


ftBMl 1 


Kazal-type serine protease inhibitor
domain


1 7p-18 

J. /C~ lO 


71 8 


432 


SH2 


Src homology domain 2 


1.4e-67 


198.4 


433 


zf-C2H2


Zinc finger, C2H2 type


2 Bp- 144 


402 7 


434 


ras 


Ras family 


0.012 


-106.8 


41ft 

t JU 


F1.F7 ATPacp 
x~> a 13^ niriuC 




1 £p 117 

1 .OB* 11/ 


101 0 


tj f 


RMA nnl A 


XVINAV pi/ljr lllCI <UC aipild j UUU 111 I 


O 
1/ 


1 077 7 


438 

tJO 


pfm 

X X 11V 


l XllV*llllgCl 


1 fip-1 1 
1 .OC- 1 1 


<»1 7 
J 1 . / 


439 


1 pcHn p 


X^Cl#LUl V^—iypo Ul/lxlolII 


4 7p-1fl 


1111 
I 1 J o 


440 




Zinc finger, C2H2 type


1 1p.£S 


711 ft 


441 


Oil Colli J 


/VIICoUll \wl O ollugCll ^ 


7 0p-7S4 


8S8 1 


442 
t*fA 


cull juii/li ail j 


AtninnfmncfprncAe place. Ill 
/ULllllUUalialci cuCo Uiaoo 111 

ovridoxal-ob o 


8 7p-80 


711 1 


443 


UCH-1


Ubiquitin carboxyl-terminal
hydrolases famil


8.5e-12


52.6


444 


CTF NFI 


CTF/NF-I family 


2.6e-277 


934.6 


451 


T-box 


T-box 


3.8e-117 


402.6 


453 


Rieske 


Rieske [2Fe-2S] domain 


2.6e-13 


57.7 


454 


zf-C2H2 


Zinc finger, C2H2 type


3.9e-64


226.5


456 


homeobox 


Homeobox domain 




38.9


459 




Immunoglobulin domain 


2.6e-20 


70.5 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase


4e-25 


96.9


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


467 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


468 


Sterol desat 


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase


Cyclophilin type peptidyl-prolyl cis- 
tr 


2.6e-63 


220.9 


470 


Peptidase M24 


metallopeptidase family M24 


6e-08 


28.1 


471 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129


441.9


472 


myb_DNA-
binding


Myb-like DNA-binding domain


3.6e-06


0 


473 


ZZ


Zinc finger present in dystrophin, CB


0.012


*20 0 i 


474 


EF1G_domain


Elongation factor 1 gamma,
conserved doma


6.3e-88




475 


Ribosomal_L31e


Ribosomal protein L31e


6.1e-66




476 


C1q


C1q domain


2.5e-75




477 


SH3


SH3 domain


1.1e-17




478 


MoaA_NifB_Pq
qE


moaA / nifB / pqqE family


U.vuX 


1*7 9 


479 


FYVE 


FYVE zinc finger


7tJC"Xi A 


/ O.v 


480 


DNA_pol_A


DNA polymerase family A


l.jc tvl 


1/C7 J 
lO/.t 


482 


adh_short


short chain dehydrogenase






483 


ank 


Ank repeat 


1.3e-17


71.9 


484 

tot 


AAYAO 


iiapDfinustMSaiuiy rainiiy 


z.ze-oi 




486 

tow 


TrR 


1 AAV UULIi Hill 




ft 

O / .0 


487 


X ATA, 1 AIV*5 


JT mVUl-UUlvAiJlg inonooxygcnaAO~AAK.6 


A 
V 


14aJiJ 


488 


IJLWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


495 


homeobox


Homeobox domain


j.OC-l/O 


"3A S 


497 


pkinase 


Eukaryotic protein kinase domain 


2.3e-166 


566.1 


400 
tyy 


fn3


Fibronectin type III domain


z.De-ij / 


oAi 0 
0OI.0 


Jul 


LRR


Leucine Rich Repeat


O Q A O 1 

y.^e-J 1 


lie / 

1 1 5.0 


509 


RGS


Regulator of G protein signaling
domain


A A>1 1 


1 1 A 

11.9 


sot 


filament


Intermediate filament proteins


ie-i4x 


A on K 


50s 


fn3


Fibronectin type III domain


1 .3C-1UU 


/ 






HECT-domain (ubiquitin-
transferase). 


loll 


CO A 


507 


Ribosomal_L7A
e


Ribosomal protein L7Ae


D.ve-xo 




508 


WD40 


WD domain, G-beta repeat


U.UDJ 


IO ft 


509 




WD domain, G-beta repeat




10 ft 


510 


WD40 


WD domain, G-beta repeat


X. J C"t4 


1 

j j**. j 


511 

mf A A 


pkinase


Eukaryotic protein kinase domain


9 1p-86 




512 


Ijtll II 1 lilt 


fiHT HnmFiin 

VJVJ1_« UUlAlCUll 


1 Op-OS 


lit T 
Jt.j 


513


SH3


SH3 domain


JC*UO 


Jt,Z 


515 


HTH_AraC


Bacterial regulatory helix-turn-helix
protei


*? Op-97 




516 


zf-C2H2


Zinc finger, C2H2 type


1.7e-34


198 0 


517 


S1


S1 RNA binding domain


6.1e-58




518 


pkinase


Eukaryotic protein kinase domain


1.8e-75


264.2


525 


cadherin


Cadherin domain


2e-80


280.6


528 


zf-C2H2 


Zinc finger, C2H2 type


4e-70


246.4


529 


neur chan 


Neurotransmitter-gated ion-channel 


5.8e-222 


750.8 


531 


RhoGEF


RhoGEF domain


3.5e-44


160.2


532 


myosin_head 


Myosin head (motor domain) 


0 


1494.5 


533


LRR


Leucine Rich Repeat


O.JC'l J 


69 t\ 


535 


Sec7 


Sec7 domain




HO 1 


536 


homeobox


Homeobox domain




ZO.t 


539 


actin


Actin


2.4e-100


330.6


542 


ank 


Ank repeat 


1.9e-35 


131.2 


544 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H type 


2.8e-10 


41.7 


546 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.4e-40 


147.4 


547 


HMG_CoA_synt


Hydroxymethylglutaryl-coenzyme A
synthas 


0 


1250.8 


549 


laminin_G


Laminin G domain


3.3e-76 


266.6 


551 


PHD 


PHD-finger 


0.008 


9.3 


552 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


0.0017


25.0


555 


WW 


WW domain


1 1o OA 


O^ ^ 


558 


kinesin 


Kinesin motor domain 


1.8e-176 


599.7 


j j j 




Zinc finger, C3HC4 type (RING
finger)


U.OUVOJ 


io.5 




efhand


EF hand


/.ye-i i 


AO A 

4y.4 


Ju / 


PUT 


rTl QOUlain 


/.oe-oo 




SAR 
JOO 


.rxl 


rn domain 


J.le-39 


143.8 




Hist_deacetyl


Histone deacetylase family


5.2e-106


365.6 


^Tft 




PDZ domain (Also known as DHR or
GLGF). 


3.4e-20 


80.5 


J / 1 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


le-lo 


58.5


J / J 


ubiquitin


Ubiquitin family 


1.4e-08 


31.1 


574 


FH2 


Formin Homology 2 Domain 


1.3e-110


380.9 


57o 


serpin


Serpins (serine protease inhibitors)


4.3e-146


496.4 


575* 


zf-C2H2


Zinc finger, C2H2 type


5.7e-76 


265.8 


580 


pkinase


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581


RhoGAP


RhoGAP domain


4.4e-53 


189.8 


582 


Ribosomal_L7A
e 


Ribosomal protein L7Ae 


0.028 


1.0 


584


kazal


Kazal-type serine protease inhibitor
domain 


2.2e-52 


187.4 




LRR


Leucine Rich Repeat


4.4e-28 


106.7 


JoO 


PHD


PHD-finger 


3.8e-12 


53.8


JOO 


GTP1_OBG


GTP1/OBG family


1.1e-62


215.2




Collagen 


Collagen triple helix repeat (20 
copies) 


8e-42 


152.4


591 


lys 


C-type lysozyme/alpha-lactalbumin
family 


1.6e-31 


116.4 






Acyl CoA binding protein


0.0022 


-9.4 


jy / 


SNF2_N


SNF2 and others N-terminal domain


3.7e-yo 


339.5 


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


OUO 


LRR


Leucine Rich Repeat 


1e-05


32.5 


OU / 


LRR


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23 


89.8 


Oil) 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237 


802.4 


OlJ 


THF_DHG_CYH


Tetrahydrofolate
dehydrogenase/cyclohydro


4.9e-173 


588.3 


01 / 


rrm


RNA recognition motif.


4e-14 


60.4 


Olo 


rrm


RNA recognition motif.


4e-14 


60.4 




cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


3e-0o 


34.2 


*591 


Nop


Putative snoRNA binding domain


o.ie-y3 


JZo.o 






uoiquiun carooxyi-iennmai 

iiyui uiiuc laiiiiiy 


j.oe-zi 


OJ.l 


625 




£«iui> ungcr, w^xix type 


^•3e-i^&'* 


HAvO.1 


628 


DEAD 






*510 ft 


632 


GST


Glutathione S-transferases.


4.oe-zo 


SO ft 


633 


5_nucleotidase


5'-nucleotidase


6.6e-248


837.0


636 


LIM 


LIM domain containing proteins 


1.6e-88 


307.5 


637 


pkinase 


Eukaryotic protein kinase domain 


1.5e-73 


257.8 


638 


MSP domain 


MSP (Major sperm protein) domain 


8.4e-09 


42.7 


639 


metalthio 


Metallothionein 


2e-24 


94.6 


641 


zf-C2H2 


Zinc finger, C2H2 type 


6.1e-114


391.9 


642 


Ribosomal_S28e


Ribosomal protein S28e 


9.3e-48 


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger 


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat 


1.5e-22 


88.4


648 


Lipase_GDSL 


Lipase/Acylhydrolase with GDSL-
like motif


0.015


2.2


652 


zf-C2H2 


Zinc finger, C2H2 type 


4.1e-146 


498.8


653 


histone 


Core histone H2A/H2B/H3/H4 


1.2e-10 


48.8 


654 


zf-C2H2 


Zinc finger, C2H2 type


1.9e-87


303.9


655 


ras 


Ras family 


6.4e-77 


269.0 


657 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 






658 


STphosphatase 


Ser/Thr protein phosphatase


2.6e-182


610.1


659 


zf-C2H2 


Zinc finger, C2H2 type


1.3e-92


321.1


660 


zf-C2H2


Zinc finger, C2H2 type


X •JC*OJ 


707 /\ 


662 


NDK 


Nucleoside diphosphate kinases


1 4 e . 1 1 o 


41/17 


664 


IRF 


Interferon regulatory factor
transcription f




7Q ^ 


665 


4HPPD_C


4-hydroxyphenylpyruvate
dioxygenase C term


1 4e-1fi 

1 ttC i u 


DO. J 


666 


DEAD 


DEAD/DEAH box helicase 


4 Re-74 


917 1 


667 


DEAD 


DEAD/DEAH box helicase 


2 9e-70 


77 5 1 


669 


pkinase 


Eukaryotic protein kinase domain 


6.1e-93 


322.2 


671 


homeobox 


Homeobox domain


n ni r 

U.U 1 o 


IO.J 


678 


crystall


Beta/Gamma crystallin


*t. / IT* J V/O 




679 


WD40 


WD domain, G-beta repeat


1 Of-Ofi 
i .7C*UO 


^4 O 


680 


Keratin_B2


Keratin, high sulfur B2 protein


*r. 1 CUO 


ICQ 
i«>.7 


682 


G-gamma


GGL domain


O.JC J J 




685 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family




1117 
111./ 


686 


Acetyltransf 


Acetyltransferase (GNAT) family 


6.6e-10 


46.4 


687 


7tm 1 

7 transmembrane receptor (rhodopsin
family)




A 


688 


proteasome 


Proteasome A-type and B-type


6.5e-64


77 S 7 


689 


SCP2 


SCP-2 sterol transfer family 


6.2e-37 


136.1 


690 


TS-N 


TS-N domain


n 04.1 


70 1 


692 


zf-C2H2 


Zinc finger, C2H2 type




7110 

X 1 1 .7 


693 


zf-MYND 


MYND finger


U.uJO 


s s 

J. J 


694 


Oxysterol BP


Oxysterol-binding protein


3.0e-133


455.7


695 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


I .jo*jv 


11^1 


703 


Peptidase C2 


Calpain family cysteine protease




J7D.U 


706 


filament 


Intermediate filament proteins


7.7e-107




710 


fibrinogen C 


Fibrinogen beta and gamma chains
C-term 




27R 0 


711


SH2 


Src homology domain 2


2.3e-65 


192 1 


712 


ATP-synt DE 


ATP synthase, Delta/Epsilon chain


0.00062 


19.0


713 


ARID 


ARID DNA binding domain


2e-17 


71.3


714 


LBP BPI CETP 


LBP / BPI / CETP family 


8.6e-34 


125.7 


715 


RNA_pol_L 


RNA polymerases L / 13 to 16 kDa
subunit 


4.8e-49


176.3


716 


KRAB 


KRAB box 


1.3e-42


155.0


717 


mito_carr


Mitochondrial carrier proteins






719 


Gal-bind lectin 


Vertebrate galactoside-binding lectin


1.5e-25


90.2


726 


aldedh 


Aldehyde dehydrogenase family 


1.3e-119 


410.8 


728 


Glycos transf 2 


Glycosyl transferases 


4e-21 


83.6 


734 


ELM2 


ELM2 domain 


2e-34 


127.8 


735 


PR55 


Protein phosphatase 2A regulatory 
subunit PR 


0 


1038.2 


737 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4e-14 


60.4 


740 


WD40 


WD domain, G-beta repeat 


5.6e-14 


59.9 


745 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 


3.8e-13 


46.9 






finger)






740 


mito_carr


Mitochondrial carrier proteins


a <o> fin 
**. oe-o / 


710 ft 






uoiriajii oi unKJiown iuncnon uurz/ 


*f.De-iz 




751 


QUI 


oxIj a o ma in 


3.oe-i / 




7</> 


HMG_box


HMG (high mobility group) box


o.oe-13 


CC A 


/jj 


orKY 


orKi aomain 


j.ye-U5 






Cj Lr CDC 


Cell division protein


7.5e-153 


CO t o 




mito_carr


Mitochondrial carrier proteins 


3e-88 


1 AC ii 

305.4 


756 


TSPN 


Thrombospondin N-terminal -like
domains 


8.1e-58 


205.5 


757


BTB 


BTB/POZ domain


5.7e-23 


89.7 


759


zf-C2H2 


Zinc finger, C2H2 type


1.2e-12 


55.4 


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal S14 


Ribosomal protein S14p/S29e 


2.1e-06 


24.8 


765 


ThiF_family 


ThiF family 


1.7e-39 


144.6 


766 


DnaJ 


DnaJ domain 


3.9e-36 


133.5 


768 


tRNA-synt_2b 


tRNA synthetase class II 


9.1e-81 


281.7 


769 


ldl_recept_a 


Low-density lipoprotein receptor 
domain


0 


1404.5 


770 


WD40 


WD domain, G-beta repeat 


2e-21 


84.6 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2 N 


SNF2 and others N-terminal domain


5.5e-99 


342.3 


776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


777


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain


1.1e-30


115.4 


778 


VPS9 


Vacuolar sorting protein 9 (VPS9)
domain


1.1e-30


115.4 


779 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-08 


31.0 


781 


cadherin


Cadherin domain


5.6e-113 


388.7 


783 


HECT 


HECT-domain (ubiquitin-
transferase).


4^e-31 


116.8 


785 


sushi 


Sushi domain (SCR repeat)


1.8e-60


214.3 


7oo 


sushi 


Sushi domain (SCR repeat)


1.8e-60 


214.3 


TOO 

/oo 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


790


rrm 


RNA recognition motif.


2.8e-20 


80.8 


791


Collagen 


Collagen triple helix repeat (20
copies)


0.00097 


9.7 


/ZfZ 


pkinase 


Eukaryotic protein kinase domain


a nil 


1 


70< 




Zinc finger, C2H2 type


o.5e-y5 




7QA 


adh_short


short chain dehydrogenase


/< 1 _ AC 




799


oAiLAiv sync 


C ATPAB D\mthnt<ico i 
Bacterial type II secretion system
3'5'-cyclic nucleotide
phosphodiesterase
PH
822


CNH 
RNA recognition motif.
TABLE 6



SEQ ID
NO:


Method


Predicted beginning nucleotide location corresponding to first amino
acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino
acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


2953 


A 


3 


324 


ISEHRIEASGNYLAQIU-TSSI^GI^SWKSWLML 
CGWmLTLTMVQGEP*GP\KGIPG\FrTTNSSYPH 
WGTVAKPPAGD* DLLPAPGQEGTPLFTR*SLCTY 
CPID 


2954 


A 


18 


467 


REELGKIJLFDCTLYVLLKYDDFNADKHLALEEF 
YRAFQVIQLSLPEDQKLSITAATVGQSAVLSCAIQ 
GTUlPPUWKRNNIILN>n^DLEDIND L YTT 
KVTTTHVGNYTCYADGYEQVTQTHIFQVNVPPV 
IRVYPESQAKRAG 


2955 


A 


3 


23 


r^SAFLVADKGIVTSKHNNDTQHIWESDSNEFSV 
IADPRGNTLGRGTTIT*VSIPPSL 


2956 



A 


1 


493 


RTKTDVY1LN1JVVAJDLL1XFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLhnFVSGMQFLACISIDRYV 

AVTKYPSQSGVGKPCWnCFCVWMAAILLSIPQL 

VFYTVNDNARCIPIFPRYLGTS1V1XAIJQMLEICIG 

FVWFLIMGVCYRTARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVEL 

KIJQNKVLl^KOTFl^PLDTRK^ 

PGAVAHACNPSTLGGRGGR1TKSGDRDHPGQHG 










ETRSLPACWAQWKSLAJLPVSRAPGRQGSLWFP 
LP 


2958 


A 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKCLD 

NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 

KKGKTCGFKRGTETRVREEIQHPSAKGNLCPPTN 

ETRKCTVQR2CKCQKGERGKKGRERKRKKPNKG 

ESKEAIPDSKSLESSKEIPEQRENKQQQ 


2959 


A 


1 


426 


LSMLSTISTEHIU.SVLWPIWYCCTCPTHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVVVLIFLIVLVLCGFTLVLLVRnCGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 
DL 


2960 


A 



1194 


852 


I EKRKTSYSQCLNSKQRNVSMRPSIWIHVHLKPPC 
RJLVELLPFSSALQGLSHLSLGTTLPA^*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFY 


2961 



A 



274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 

SLTPF^SVRRMPUTTVTLIJCMVARHHMKLLCSK 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHn 

SILMGQPMALVQLETLAPLTIQQKFQTQDHMKF 

WKNLPLHSHHLTPSWQTVrPKKTGSPEIKLKJTK 

TIQNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 

EKRKXSNKHDSSRSEERKSHKIPKLEPEEQNRPN 

ERVDTVSEKPREEPVLKEGSPSSANTIFCSNNGSV 

HWNFKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVHTKINTRGAREYHVQFFSNQPERAWVHEKRV 

REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 

PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 

D KQPEEALS Q AKKS V A SKTE VXKTRRPRS VLNT 

QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 

PPVKIAWKTAAARKSLPASITMHKGSLDLQKCN 

MSPWKDEQVFALQNATGDGKFIDQFVYSTKGIG 

NKTEISVRGQDRLHSTPNQRNEKPTQSVSSPEATS 

GSTGSVEKKQQRRSIRTRSESEKSTEVVPKKKIK 

KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 

SSVSAAIEETVD 


2962


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVl^QNVGTTHDLLDICLKRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWIFSCLN 

VMRLPFMKKLNEEEFEFS QS YLFFWDKVERC YFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYI"rEATRRMND 

ILNHKMREFCIRLRNLVHSGATKGEISATQDVM 

MEEIFRVVCICLGNPPETFTWEYRDKDKNNKKIG 

P\ITPLEFNR/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTN^YLVSNMVWRGEKLFYNNQPIDFLK 

KMVAASIKIX3\EAVWFGCDVGKHFVNSKLG\LSD 

MNLYDHELWGVSLKNMNKAERXLTFGESVLMT 

HTMTFTAV/SQSRDDSGMVLFTKWNRVGEFQWG 

EDHGHVKGYLCMTD*VGSLEYVYEVVA^WDRKH 

VPNEEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKITLKIAKNYLEQRAVGGASPRLAQS 
VLTCSREPILENSLTSIJEYLH>MJLEHDMRLRFNN 
DRMKTTIKETST* LSNSYL VFPLM* SLTYLMKMS 



FERCTAKNKMFVNSPFTKVDNYCT\SS\WKICFyL 
KCYFSLNTIKKEKKMT 



2964 



2454 



FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMUQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSBCLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLTPNDQLLPR 

KLNTEPKD VP/IA C A S A ♦ GrXPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG* PRRHLKEQNLSWKVIFFQGA VTVVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATELPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYWIFHMLKLAVKVPLYVNLMKI4EEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 
HRGAIYGSSW 



2965 



2454 



2966 



1693 



FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKS WE ASGKHQE VSKPA VSLEQRKQDTSKLRS 

TLPEEQKKQE1SKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIFNDQLLPR 

KLNTEPKDW/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLNfTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VK\OFFQGA VTVVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

P(^QLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

l^J 7 YPAQTrWFPRPTQPFVNSRGSVRGCTRGGRL 

rmSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PWVPVTNPAAmPVHVYPIJKKJMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLKNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKlJW>rm/^^ 

VSAYANDGAPDHETASWHAILQLFQGDQIWLRL 
HRGAIYGSSW 



227 



DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 

LGI^YVMANTGVFGFSFLLLTVALLASYSVHLL 

LSMCIQTAYLGP»TNYFMVLPAH*LTCLPLIEFLQ 




SL*NSL\*AVTSYEDLGLFAFGLPGKLWAGTIIIQ 

NIGAMSSYLLDKTELPAAIAEFLTGDYSRYWYLD 

GQTLLniCVGIVFPLALLPKIGFLGYTSSLSFFFM 

MFFALWIIKKWSJPCPLTLNYVEKGFQISNVTDD 

CKPKXFHFSKESAYALPTMAFSFLCHTSJDLPIYCE 

LQSPSKKRMQNVTNTAIALSFLIYFISALFGYLTF 

YD/GTTKAQRGEVTCHRIKDKVESELLKG* * *IP* 

SHDVVVMTVVKLCILFAVLUTVPLIHFPARKAVT 

MMFFSNFPFSWlRHFLITLALhn^MLLATYVPDIRN 

VFGWGASTSTCLIFIFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSBLRNSLSVYIILPASRKSIYFIC 

I 


2967 



A 



3 


3222 



SGIWRALWREKKPGG GRR VKRRNPGRQ A VGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVIEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLIL1ANAIVGVWQERN 

AENAIEALKEYEPEMGKVYRADRKSVQRIKARD 

IWGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHl'EPVPDPRAVNQDKKNMLFSGTNI 

AAGKALGIVATTGVGTE1GKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISLICVAVWLENIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTRRMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTD VRSLSK VERANACN S VIRQLMKJCEFT 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEG VIDRCNYVRVGTTRVPLTGP VKEKIMA VTKE

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGWGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRIGIFGENEEVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SKIVEYLQSYDEITAMTGD G VNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR 

AIYNNMKQFIRYLISSNVGEVVCIFLTAALGLPEA 

LIPVQLLWVNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLI\SG WLFFRYMAIGG YVG AATVGAAA 

WWFLYAEDGPHVNYSQLTHFMQCTEDNTHFEGI 

DCEWEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLLRMPPWVNIWLLGSICLSMSLHFLIL YVDPLP 

MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA 

RNYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414 


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLILQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEELSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSIDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCIHIEPVWKEV 

IQELEELGVGIGWHAGYERRLAHHLGAHSTPSI 

LGIINGKISFFHNAWRENLRQFVESLLPGNLVEK 

VTNKNYVRFLSGWQQENKPHVLLFDQTPIVPLL 

YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYNI 

NIYAPTLLVFKEHINRPA0V1QARGMKKQ1IDDFI 


TRNKYLLAARLTSQKLFHELCPVKRSHRQRKYC 

WLLTAETTKLSKPFEAFLSFALANTQDTVRFVH 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERKNT 

AGRWYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 

CWDSIFHNNWXREMMPLLSLEFSALFILFGTVIVQ 

AFSDSNDERES SPPEKEEAQEKTGKTEPSFTKENS 

SKIPKKGFVEVTELTDVTYTSWLVRLRPGHMNV 

VLILSNSTKTSLLQKFALEVYTFTGSSCLHFSFLSL 

DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 

TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 

SDVDSSLYLGESRGKPSCGLGSRPDCGKLSKLSL 

WMERLLEGSLQRFYTPSWPELD 


2969 



A 


48 


1117 


KGl^PDQVl^AFAPlJDCEMWLiCVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQHWIJFERPHTMPKYLLGSVNKSVVPD/YGI 

P/YTS SP*CHPMASLLINPLQFPDEGNYI VKVNIQG 

NG1XSASQKIQVTVDDPVTKPWQIHPPSGAVEY 

VGNMTLTCHVEGGTRLAYQWLKNGRPVHTSST 

YSFSPQNKI1.HIAPVTCEDIGNYSCLVRNPVSEM 

ESDIIMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPNTYSWIRRTDNTTYIIKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVnTSVGMCDIQGRDPNKT 


2970 


A 


68 


936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL 

QFQNSSEMEIGPEIGKFGEKAPPAPSHVWRPAAL 

FLTLLCXLLLIGLGVIJ^SMFHVTLKIEMKKMNKL 

QNISEELQRNISLQLMSNMNISNKIRNLSTTLQTI 

ATKLCRELYSKEQEHKCKPCPRRWIWHKDSCYF 

LSDDVQTWQESKMACAAQNASLLKINNKNALE 

FDCSQSRSYDYWLGLSPEEDS/YSWYESG * YNQ\P 

SAWVIRNAPDLNNMYCGYINRLYVQYYHCTYK 

QRMICEKMANPVQLGSTYFREA 


2971 


A 



912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF 

LVAFAYWNHYLSCTSPCSCYKPLCRLNFGLNW 

ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 

GLPLPHSDLPTSWCGHSLQCGSQSSFPPAEHENAF 

IVFIASSLGHMLLTCILWRLTKKHTVSQE\DGLSL 

AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 

RGVLGLGLGLGNKLRWGQNLGL*HC VWV V WE 

TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 

HNSSAPPMYMGFFSPTVFGGGVGG*LHVTFILHP 

PEVEAAGIPLLLGPSLPQRQGREHrVVILAAPACA 

PFHDR* WEPREIRPSP * ELGLRGEPTLS YP A S CR VI 

RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 

MYCEAG VYTIFA TT ,K YTWLTNMAFHMTA WWD 

FGNKELLITSQPEEKRF 


2972 


A 


1734 


246 


GGILSGRDGRTALPRPREPAERTAGLRRDMRPQE 
LPRLAFPLLLLLLLLLPPPPCPAHSATRFDPTWES 
LDARQLPAWFDQAKF GIFIHWG VFS VP SFGSE WF 
WWYWQKEKIPKYVEFMKDNYPPSFKYEDfGPL 
FTAKFFNANQ\WADIFQ AS GAK YIVLTSKHHEGF 
TLWGXSEYSWKWNAIDEGPKRDIVKELEVAIRNR 
TDLRFGLYYSUT5WFHPLFLEDESSSFHKRQFPVS 
KTLPELYEL VNNYQPE VL WSD G DGG A PD Q YWN 






STGFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYOTGHIXPHKWENCMTIDKLSWGY 

RREAGISDYLTEEELVKQLVETVSCGGNLLMNIG 

PTLIX3T1SVVFEERLRQMGSWIJCVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKA1LGATEVKLLGHGQPLNWISLEQN 

GJMVELPQLTIHQMPCKWGWALALTNVI 


2973 


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGELQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFK1FDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPILPRJQEQFQKNPDSYNGAVRENYTW 

SQDYTOLEVRVPVPKHWKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAELEGEEPID1DKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRPDPAMFNISPGA 

VQF 


2974 


A 


271 



1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKEIRLPKSFSNTQ 

NSRKEAXOXAKMKHPNIVAFKESFEAEGHLYIV 

MEYCDGGDLMQKIKQQKGKLFPEDMILNWFTQ 

MCLGVNH1HKKRVLHRDIKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYVGTPYYVPPEIWEN 

LPYNNKSDIWSLGCILYELCTLKHPFQANSWKNL 

1LKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATTLLSRGIVARLVQKCLPPEIIMEYGEEVLE 

EIiaaSKHhrTPRKKTNPSRIRIALGNEASTVQEEEQ 

DRKGSHTDLESINENLVESALRRVNREEKGNKSV 

HLRKASSF^HRRQWEKNVPNTALTAJLENASILT 

SSLTAEDDRGGSVnCYSKN'lTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGSVEGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DNPDWVSELKKRAGWQGLCDR 


2975 



A 



32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAWQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLXQHPFVTQHLTRJSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAK1P 

PPLPPKPKSIFEPQEMHSTEDENQGTDCRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMS SFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TlUDQYLIFGAEEGrYIXNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 








RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCWRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQVVOU^ETVNPNSTSSWFTES 

DTPQTmrraVTQLERDmVCLIX:CIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAPWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLY1LAGHENSY 


2976 


A 


32 


2833 



PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRKNPQEDFELIQRIGSGTYGDVYK 

ARNVOTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKJHP\DIVAYF\GSYLVRRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKI-ADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKraU*T 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP

PPLPPKPKSIFIPQEMHSTEDENQGT1KRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLS1SGKASQLYSHNLPGLFDYA 

RQMQKJLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGRDFNQWRFETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCIJDCCIKIVNLQGR 

LKSSRKJL S SELTFDFRIES IV CLQDS VL AF WKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLYELAGHENSY 


2977 


A 


174 


1543 



YSLRKG1TFKXAGAMVHIKKGELTQEEKELLEVI 

GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 

AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA 

LMFAALSGNKDITWVMLEAGAETDVVNSVGRT 

AAQMAAFVGQHDCVTnWIFFPRERLDYYTKPQ 

GLDKJSPKXPPKI^GPLHKnTTTNLHPVKJVMLV 

NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 

NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 

SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 

QQLVRSlAPVEIGSDPTAFSVLTQAITGQVOl^vDV 

EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 

FTHKKICKNLKDIYEKQQLEAAKEKRQEENHGK 

LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 

KESLESEAELEGLQDAPAGPQVSEE 


2978 


A 


3 



5177 


SDDLRTGIJQDVQDAESIJKLPGVYEVLFYNETE 
DCTGMMLWRYPEPRGLTLVRITPVPrWTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVhnDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTOTPLVTPTAIjVACTR 




VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

VVKPFSIFGQMAVSSDVVEKIXDCTVIVDSVFVN 

LGQHWHSLMTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENILLASLHSHQYSWRS 

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQIIICGRQIICSYL 

SQSIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVIQVPSSNSSIIYVWCTVLTLEPNSQVQQ 

RMIWSPLFIMRSHLPDPIIIHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKTVLQ 

WAGKTTTPPNFQEAFQIGIYWANTNTVHKSVAIK 

LVHNLTSPKWKDGGNGEVVTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK' 

QIMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRATVLTYQEHLGVTYLTLSED 

PSPRVIIHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDW 

HQCGTVTITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHHKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

NKELEE YKEKCFIKLC I TLNEGKSILCDINEFSFEL 

KPARLYVEDTFVYYIKTXFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITNLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSHVKYVWKMLQS 

LGRPEVHMALDWLVRGSGQEHEGCLLLTSEVL 

FWSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDFxir Avj V r LoJvr 1 M V JsJN JsJ\LrKJ\Ajr i* 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLWLFPFS 

SRPTLGHLDSKPS SKSNMIRGRNS ATSADEQPHIG 

NYRLLKTIGKGNFAKVKIJ^RHILTGKEVAVKIID 

KTQLNSSSLQKLFREVRIMKVLhrHPNIVKLFEVIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQIVSAVQYCHQKF1VHRDLKAENLLLDA 

DMNIKIADFGFSNEFTFGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVELYTLVSGSLPFDGQ 

KLKELRERVLRGKYRIPFYMSTDCEN1XKKFLIL 






1^KRGTLEQ1MKDRWMNVGHE\DDELKPYGEP 

LrADYKDPRRTELMVSMGYTREEIQDSLVGQRYN 

EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 

SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 

YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 

SPLPGLERKKTTPTPSTNSVLSTSTNRSRNSPLLVE 

RASLNGQGFHPEWAKTALTMPGSRASTASASAA 

VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 

VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 

PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 

VTPASPSGHSQGRRGASGSIFSKFTSKFVRRNLNE 

PESKDRWETLRPHVWNSGGNDKEKEEFREAKPR 

SLRFTWSMKTTSSMEPNEMMREIRKVLDANSCQ 

SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 

RLSLNGVRFKRISGTOMAFKNIASKIANELKL 


2980 


A 



120 


3433 



NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 

TKLNERVKTVKLEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 

SHREQ1IELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEAX 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLD^HKEFMKKLEEKRAE 

L>nCATTMGDTVl^ICHPDSITTIKHWrriIRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSH1PV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPA 1" 1 ' 1PKILHPLTRNYGKP WLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

Da CxLl IT AAAKVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVC^DVETVPQTHR 

PTPRAGSRPSTAKPSK1F1PQRKSPASKLDKSSKR 


2981 


A 


120 


3433 


NCXLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQ QMLARCPK5 AETNIDQDINNLKEK WESVE 

TKI.NER\KTVKLEEALNLA\MEFHNSL\QDFIh^ 

QAEQTLNVASRPSLELDTVLFQIDEHKVFANEVN 

SHREQI1ELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 










SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHS\ryi>TTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA

REL1EG SRDDSS WVKVQMQELSTRWETVC ALSIS 

KQTEU.EAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSrrTIKIIWITIIRARFE^ 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKJKYMRWMNffiCKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLONQ 

FGDSQQLRI.VRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTOMELREKFILAIXjASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPArriPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGUTTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASBCLDKSSKR 


2982 


A 


1 



2065 


MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLG SGGDRVCFNLGRE 

LYFYPGCCRRGSQRWHTPLTPFLPPLKSIDLNKPI 

DKRIYKGTQPTCHDFNQFTAATETISLLVGFSAG 

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP

ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL

KQVAWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 

KSYFGGLLCVCWSPDGRYVVTGGEDDLVTVWS 

FTEGRWARGHGHKS WVNA VAFDPV iTKAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

ATLTLQERRDRGAEKEHKRYHSLGNISRGGSGG 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 

PLLEPLVCKKIAQERLTVLLFLEDCnTACQEGLIC 

TWARPGlCAFTDEETEAQTGEGSWPKSPbKSVVE 

GISSQPGNSPSGTW 


2983 


A 


385S 


220 


RRFRJLS AHRA QPCCRCRGLEMPRG VFQ QLSNL V 

LQEI^IANLSNLTSAFEKATAEKIKCQQEADATN 

RV1LLANRLVGGLA SENIRWAES VENFRS QG VTL 

CGDVLLISAFVSYVGYinTCKYRNELMEKFWIPYI 

HNLKWIP1TNGLDPLSLLTDDADVATWNNQGLP 

SDRMSTENATDLGNTERWPLIVDAQLQGIKWIICN 

KYRSELKA1RLGQKSYLDVIEQATSEGDTLLIENI 

GETVDPALDPIXGR>ITIKKGKYIKIGDKEVGVPP 










Q VPPDPTOQ VLQPTLQARDAG SVHXLENFLVTRD 

GLEDQLLAAWAKERPDLEQLKANLTKSQNEFK 

I VLKELEDSLLARLS AA SGNFLGDTAL VENLETT 

KHTASEIEEKWEAKJTEVKINEAREKm > AAER 

ASLLYFIL>TOLhQCIWyYQFSLKAFNVVFEKAIQR 

TTPAOTVOCQRVINLTDEITYSVYMYTARGLFERD 

KLIFLAQVTFQVLSMKKELNPVELDFLLRFPFKA 

GVVSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 

EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 

LClVrVRCLRPDRMTYAIKNFVEEKMGSKFVEGRS 

VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 

LGFTBDNGKLHNVSLGQGQEWAENALDVAAEK 

GHWV1LQNIHLVARWLGTLDKKLERYSTGRHED 

YRVFIRAEPAPSPETH1IPQGILENAIKITNEPPTGM 

YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 

CYFHAVVAERRKFGAQGWNRSYPFNNGDLTISI 

NVLYNYLEANPKVPWDDLRYIJFGEIMYGGHrrD 

DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQIPP 

NLDYKGYHEY1DENLPPESPYLYGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

KAVLDDILEKIPETFNMAEIMAKAAEKTPYVVV 

AFQECERMNILTNEMRRSLKELNLGLKGELTITT 

DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 

YANLLLRIRE1JEAWTTDFALPTTVWLAGFFNPQS 

FLTAIMQSMARKNEWPLDKMCLSVEVTKKNRE 

DMTAPPREG S YVYGLFMEGARWDTQTGVIAEA 

RLKELTPAMPVIFTKADPVARMETKNIYECPVYKT 

RIRGPTYVWTFNLKTKEKAAKWILAAVALLLQV 


2984 


A 


2 


1464 


FVLFPGIAMETPGASASSLLJLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEENGDLERMRQIAIKFGSALGKM 

SREPPPPYVTPATFETPEVHAGTGWGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAEHQAIESSQASVETWKYKAKNSLMYYP 

EGVPDEEQLFKKPRQWHKNTRFLRDPFSQALSR 

CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 

GGFGFVATPSPAPGVNESPMMTWGEVENTPLRV 

EGSETPYVDRTPGPAFKTT .EPGRRERLGLKMANE 

AAAKNRAKKQEALRRVTENLA SLTPKGLSPAMS 

PALQRL VSRTASKYTDRALRA S YTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\S1T 

DNLLQLPARRKASDFF 


2985 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQXQDRGGRCRG 

GSGGGGSXGGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVniESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGG YGGGGSGGGGRGGFPSGG GGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQVQDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNl'lFVC^LGEKSTTIESVADYFKQIGIDCTNKKTG 

QPMIKLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2987 


A 


1376 


898 


GGAKAGGAPHPFTLPFRHVGGLSAAPEEVEGML 

WAGARQHGRNWRKJfcETSPGTQGPLPPVPR/VPP 

GPDGVPHAIAPTLSWAIPRQQCSPQPGRLNALPPD 

RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 

CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAEDP 

LRVAPLPLYAA1FLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSIILLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQC WDYGGS SSTEN 

AVTAIRFLFGFLGPLVAVASCHSALLCWAARRC 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 

LL ARALRAEPLIVG LAJL AHS CLNPMLFL YFGRAQ 

LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 

VSEMEV 


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDDIQKGNTIKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHNLHVTKTLMK 1 RRRLEQERATMQM 

TrOlirRKPKX-ASr GGMG 1 1 SSLPSFVGSGInHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSIRH 

SPI^SGISTPVTNVSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYIDYEEEEMETVEQSTQRKEFRQLNTADMQA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRG SRSCKD AA V GTL VEMRNCG VS VTEA 

MLGVMTEADKEffiLQQQTIESLKEKIYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 

LVFSKWEAVVQTRDQMVGSHMDLVDTCVGTS

\^TNSVGISCQPECKNKVVGPELPMNWWIVKER 

VEMHDRCAGRS VEMCDKS VS VEVS VCETG SNTE 

ESVNDLTX.LKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STOLEQVHQFTN TK1 ATLffiSCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKX)INSSTKTRSIGVGTLL 

SGHSGrorU>SAVKTKESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHVTERIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMG SLNSQLISTLSSINS VM 

KSASTEELRNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKP1SSLDAFPTQ 

EGTLSPVNLTDDQIAAGLYACITnINESTLKSIMKK 
KDGNKDSNG AKKKLQFVGING G YETTSSDDS S S 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNDBGLKSARVEDEMQVQECEPEKVEIRE 

RYEI^EKMLSACNLLKNTINDPKAJLTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDY1AAFEAISP 

DVLR YVINLADGNGNTALHYS V SHSNFEIVKLLL 

DADVC^VDHQNKAGYTPIMLAALAAVEAEKDM 

RIVEELFGCGDVNAKASQAGQTALMLAVSHGR1 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH 

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 

RGSFD 


2990 


A 


69 


1687 


ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 
AALGGAEPGSHLHCG VRLQRREEPG GQQRLLPQ 
RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 
ILEETRGPPASANPDKJDHSTQPGTMGRKKIQISRI 
LDQRNRQV 1 K1KRKFGLMKKAYELS VLCDCEIA 
LHFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 
TOTDBLETLKRRGIGLDGPELEPDEGPEEPGEKFR 
RLAGEGGDP ALPRPRL YPAAPAMPS PD V VYGAL 

PPPGXCDPSGLGEALPAQSRPSPFRPAAPKAGPPG 

LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 

GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 

PGGPPVOAEAWARRVPQPAAPPRRPPQSSIKSER 

LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPPA 

CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 

\TSIX3AFSEKTHTVTAPLRGGGLEVGGWTQSSAG 

GLLSFFLFVCISTNKNARGVRGPEKK 


2991 


A 


3 


1159 


IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 
CSPVRPLSSLPDKKKELLQNGPDLQDFVSGDLAD 
RSTWDEYKGNLKRQKGERLRLPPWLKTEEPMGK 
W i NJvLlsJN l UtvN LfNl/H I V CEEAKCPNIGEC WGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYVVLTSVDRDDMP 

DGGAEH1AKTVSYLKERNPKILVECLTPDFRGDL 

KAIEKVALSGLDVYAHNVETVPELQSKVRDPRA 

NFDQSLRVIJCHAKKVQPDVISKTSIMLGLGENDE 

QVYATMKAIJ^JEADVDCLTLGQYMQPTORHLKV 

EEYITPEKFKYWEKVGNELGFHYTASGPVLVRSS 

YKAGEFFLKNLVAKRKTKDL 


2992


A


3 


1636 


PVPGVPTSPPSCCPQDMQGPWVLLLLGLRLQLSL 

GV1PAEEENPAFWNRQAAEALDAAKKLQP1QKV 

AKNLILFLGDGLGVPTVTATRILKGQKNOKLGPE 

TPLAMDRPPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGVVrriKVQHASPAGTYAHTV 

NRNWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

Khn^VQEWLAKH(^AWYVWNRTELMQASLDQS 

VTHLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSKNTPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDA1ERAGQLTSEEDTLTLVTADHSH 

VFSFGG YTLRG SSIFGLAPSKAQDSKAYTSIL YGN 

GPG YVFNSGVRPDVNESESG SPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 


2993 


A 


3 


685 


DAWARLLKMNRLFGKAKPKAPPPSLTDCIGTVD 

SRAESrDKKISRLDAELVKYKDQIKKMREGPAKN 

MVKQKALRVLKQKRMYEQQRDNLA\NSHSTW\ 

TSXHYTIQSLKDTKTTVDAMKLGVKEMKKAYKQ 

VKIDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PATPEGVPTDTKNKDGVLVDEFGLPQ1PAS 


2994 


A 


1710 


161 


RRCELTPFIIKTLILPKSWGAFPEDWMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMVRGGPAGGQNNC^VDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

Y1QVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 

SHITTANLPPQISSnQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

MAQMRKQCLDYHHQEMQALKEVFKEYLIELFF 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLD1 

EEEEEE\HPEVINDEVXVVARKHGQPGTPVAIAT\ 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 




A 

A 


3 


924 


SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 

APATTSSWEVVRWLIASSFSL\nKLVLRRQUCNK 

CCPPPCKFGEGKI^KRLKHKDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHVAATAQALPQD SGTAA WKGNRV 

L*L,rr*l V^JsJvv^yl-o JtSLJ 1 JL I IriljJL>r 1 iiO Y v£AL> Y HA V V 

EPMLWNPS GTPKRYSLELGKAIKQKLWE ALCS Q 
GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 
SKK 


2996 


A 


3 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWQDAEQIKRCQEKHNK1XSRTTFLNKKILN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKHha-DUmWKSNAAKhOJDKTIGHGQVFTQ 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCTEFGKJOFTQRSHFFAPQKIHT 

VEKPHELSKCVNVFTQKPLLSm^VrDUJEKLYA 
CTKM/CGKGLHPRNSELIMHEKTHTOEKPYKCNE 
\CGKSFFQVSSLLRHQTTHTGEKLFECSECGKGFS 
LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 
MHQRIHTGERSYICTQCGQAFIQKAHL1AHQRIH 
TGEKPYECSDCGKSFPSKSQLQMHKRIHTGEKPY 
ICTECGKAFTNRS>n.NTOQKSHTGEKSYICAECG 
KLAJFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 
SELITHQRIHTTEKPYKCPDCEKSFSKKPHLKVHQ 
RIHTGEKPYICAECGKAFTDR5>n r NKHQTIHTGD 
i KPYKCSDCGKGFTQKSVLSMHRNIHT 


2997 


A 


3 


1763 


AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 

FQNISCG IHYLAS WMG VTPHHVCRPPGNVSQVV 

FHNrTSNWSLEDTGALLSSGQKDYVTVQLQNGEI 

WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 

YTYT)QOTWKSTAVTQWNLVCDRKWLAMLIQPL 

FMFGGPTGIGAnTFGYFNSDRLGRRVVLWATSSS 

MFLFGIAAAFAVDYYTFMAARFFLAMVASGYLV 

VGFVYVMEFIGMKSRTWASVHLHSFFAVGTLLV 

ALTGYLVRTWWLYQMILSTVTVPFDLCCWVLPE 

TPF WLLSEGRYEEA QKMVDIMAK WNRAS SCKLS 

ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 

RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL

FLLGWEIPAYTFVCIAMDKVGRRTVLAYSLFOS 

AI^CGVVMVIrXJKHYILGVVTAMVVGKILPIGAA 

FGXLIYLYTAELYPTIVRSLAVGSGSMVCRLASIL 

APFSVDLS3IWIFIPQLFVGTMALLSGVLTLKLPE 

TLGKJU-ATTWEEAAKIJESENESKSSKLLLTTNNS 

GLEKTEAITPRDSGLGE 


2998 


A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DWGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE 

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL 

KLANNGTVLRASHGTKMMTPEVLAEAYGKKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

WDYLTKLNNGQKTFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLKQ 

LKNAGKILLLITSSHSDYCRLLCA\YILGNDFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKVV 

YFGDSMHSDIFPARHYSNWETVLILEELRGDEGT 

RSQRPEESEPLEKKGKYEGPKAKPLNTSSKKWGS 

FR1DSVLGLENTEDSLV YTWS CKRISTYSHAIPSI 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLVLSS 

DETLISK 


2999 


A 


320 


2417 


LRRRJKMTPQSLLQTTLFLLSLLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSLHYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKAS SLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQN1SLPSAASFTFSFHSPPH 

TGAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

PSAAPASQQLQSLESKLTSVRFMGDMGSFEEDRI 

NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 

VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 

DKNSSQVLGEKVLGIWQNTKVANLTEPVVLTF 

QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 

ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 
KHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 
CRRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPVA 
LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 
RLWEWGTYWGYLLKLSAMGWGFP1FLVTLV 
ALVDVDNYGPHLAVHRTPEGVIYPSMCWIRDSL 
- VSYIT^GLFSLVFLFNMAMLATMVVQILRLRPH 
TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 
LWLYLFSnTSFQGFLIFIWYWSMRLQARGGPSP 
LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGWFSVn VD 

LKRKPADLQNLAPGTHPPFITFNSEVKTDVNKIEE 

F1.EEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

YUCNSRPEANEALERGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIKFSTRKFLDGMEMTLADCNLLPKL 

HIVKVVAJKXYRNFDIPKEMTGIWRYLTNAySRD 

EFTNTCPSDKEVEI^YSDVAKRLHQVKSRLLKE 

VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYTISGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002 


A 


909 


2799 


VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKM'ri'RNFPEREV 

PCDVEVERbl REVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSD ACGKGFNH SME VTHGRNP VREKP YKY 

PESVKSFNHFTSLGHQKIMKRGKKSYEGKNFEN1 

FTLSSSLNENQRNLPGEKQYRCTEC GKCFKRNSS 

LVLHHRTHTGEKPYTCNECGKSFSKNYNLIVHQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KP YECLECGKTFNRN SSLILHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKJHSGEKPYECKECGKTFIESAYLIRHQRIH 

TGEKP YG CNQCQKLFRNI AGLIRHQRTHTGEKP Y 

EC^QCGKAFRDSSCLTKHQRIHTKETPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

S SSLVRHQRAHLGEQPMET* WLGAT* VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSRISSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDRLAS YLDKVRALEEANADL 

EVKIRPWYQRQRPSEIKX)YSPYFKTIEDLRNKIIA 

ATIENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 

LA YLRKNH* EEML ALRGQTGGE VNVETDAAPG 

VDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT 

EELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 

QSQLSMKASLENSLEETKGRYCMQLSQIQGL1GS 

VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 

TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 

SSRQTRPDLKEQSSSSFSQGQSS 


3004 


A 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK I 

DKVLVAARRNASAVVLYNEERYGNITLPMSHAG 

TGNIVVIMISYPKGREILELVQKGIPVTMT1GVGT 

RHVQEFISGQSVVFVAIAFITMMUSI^WLIFYYIQ 

RFLYTGSQIGSQSHRKKnOCVIGQLLLHTVKHGE 

KGIDVDAENCAVCIENFKVKDIIRI1JPCKHIFHRIC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 

GP1S 


3005 


A 


184 


2552 


TNniHQFLLLFLFWVCLPHFCSPEIMFRRTPVPQQ 

RILSSRVPRSDGKILHRQKRGWMWNQFFLLEEY 

TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 

LHroEKTGDIHATRRIDREEKAFYTLRAQAlNRR 

TLRPVEPESEr^IKIHDINDNEPTFPEEIYTASVPE 

MS WGTS WQVTATDADDPSYGNSARVIYS ILQ 

GQPYFSVEPETGHRTALPNMNRENREQYQWIQ 

AKDMGGQMGGLSGTTTVNITLTDVNDNPPRFPQ 

NTIHLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YRHDGDGTDMFDIVTEKDTQEGUTVKKPLDYES 

RRLYTLKVEAENTHVDPRFYYLGPFKDTTTVK1SI 

EDVDEPPVFSRSSYLFEVHED1EVGTIIGTVMARD 

PDSISSPIRFSLDRHTDLDRIFNIHSGNGSLYTSKP 

LDRELSQWrD^LTVlAAEINNPKETTRVAVFVRlL 

DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 

KDDPLGGQKKTSLAAVOTNFTVQDNEDNTARIL 

TRKNGFNRHEISTYLLPVVISDNDYPIQSSTGTLTI 

RVCACDSQGNMQSCSAEALLLPAGLSTGALIAIL 

LCmLLVIVVLFAALKRQRKKEPLILSKEDIRDNIV 

SYNDEGGGEEDTQAFDIGTLRNPAAIEEKKLRRD 

IIPETLFIPRRTPTA PDNTDVRDFINERLKEHDLDP 

TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 
QNYDYLREWGPRFNKLPQKYGGGESDKDS 


3006 


A 


2 


541 


GRVDKTWWGKS VGIMLTELEKALNS IID VYHK Y 

SLKGNFHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKEUDINTOGAVNFQEFLILVIKMGVAALNSn 

DVYHKYSLKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLILVDCMG 

VGSPQKKVASYF 


3007 


A 


1 


1253 


MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGIEAIhrWEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTHWPPEKRVAYCFEVAAQRSPDKKT 

CPMKEGNPFGPFWDQFHVSFNKSELFTGISFSAS 

YREQWSQRFSPKEHPVLALPGAPAQFPVLEEHRP 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 

RIGSDWKNACAMLKDGTAGSHFMASPQCVGYS 

RSTAAPLTMTMCLPDLKEIQRAVXLWVRSLDAQ 

SVYVATDSESYVPELQQLFKGKVKWSLKPEVA 

QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPTMAj^WSSTSTRWSHVKVKTS S 

QPGF^ERLSETSGGMFVGLMAFLLSFYLIFTNEG 

RALKTATSLAEGLSLVVSPDSIHSVAPENEGRLV 

HEGALRTSKJLLSDPNYGVHLPAVKLRRHVEMY 

QWV^TEESREYTODGQVKKETRYSYNTEWRSEn 

NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLIDKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HWTVIARQRGDQLVPFSTKSGDTLLLLHHGDFS 

AEEVFHRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTLVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPDLVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWAIELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DRILVEKRCWDIALGPLKQIPMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQAJ^MAISATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 


1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VWECTMASSNTVLMRLVASAYS1AQKAGMIVR 

RV1AEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELTEDSQWEEILK 

QPCPSQYSAIKEEDLWWVDPLDGTKEYTEGLL 

DNVTVLIGIAYEGKAIAGVINQPYYNYEAGPDAV 

LGRTlWGVLGLGAFGFQLKEVPAGKHnTTTRSH 

SNKI.VTDCVAAMNPDAVLRVGGAGNKIIQLIEG 

KASAYVFASPGCKKWDTCAPEVILHAVGGKLTD 

IHGNVLQYHKDVKHMNSAGVIJVTLRNYDYYAS 

RVPESUCNALVP 


3011 


A 


291 


1452 


SPQKTMRSHTITMTTTSVSSWPYSSHRMRFITNH 

SDQPPQNFSATPNVTTCPMDEKIJLSTVLTTSYSVI 

FIVGLVGNIIAiYVFLGIHRKRNSIQIYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTL G VILCK WGTLFY 

MNMYISIILLGFISLDRYIKINMIQQRKAITTKQSI 

YVCCIVWMLALGGFLTMIILTLKKGGHNSTMCF 

HYRDKHNAKGEAIFKrTLVVMFWLIFLLIILSYIKl 

GKNLLRISKRRSKFPNSGKYATTARNSFIVLIIFTI 

CFVPYHAFRFIYISSQLNVS SCYWKEIVHKTNEIM 

LVI^SFNSCXDPVMYFLMSSNIRKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDMQVQENFNISRTYGKWYNLAIGSTCPW 

1JCKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

WHThTVT^EYAIFLTKKFSRHHGPTITAKjLYGRAP 

QUUHIXQDFRWAQGVGIPEDS1FTMADRGECV 

PGEQEPEP1LIPRVRRAVLPQEEEGSGGGQLVTEV 

TKKEDSCQLGYSAGPCMGMTSRYFYNGTSMAC 

ETFQYGGCMGNGNOTVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL 

VSEEDMVTVVEDWMNFYINYYRQQVTGEPQER 

DKALQELRQELNTLANPFLAKYRDFLKSHELPSH 

PPPSS 


3014 


A 


1 


373 


GTSWSTLRAVMSASWSVVSRVLEEYLSSTPQRL 
KLLDAYLLYILLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRIU.DEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFL1AFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDIIST 

LQSLNMVK Y WKGQH VICVTPKLVEEHLKS AQ Y 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVrTCVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKE1YRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDnST 

LQSLNMVK YWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGG WGAA VCRGRWG S VSIWTGRSQGLLI 

AVT 


3017


A 


38 


704 


EAHTOGQLGSERNGVRMDEDVLTTLKILnGESG 

VGKSSLLLRFTDDTFDPELAAUGVDFKVKTISVD 

GNKAKLAIWDTAGQERFRTLTPSYYRGAQGVIL 

VYDVTRRDTFVKLD>TWLNELETYCTRNDIVNM 

LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 

AKTCDGVQ'CAFEELVEKIIQTPGLWESENQNKG 

VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLn.QMVKLSIVLlTQFLSHDQGQLTKELQQH 

VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 

HTSHSG 




A f 

A 


1307 


711


PGITMAASLVGKKIVFVTGNAKKLEEVVQILGDK 

FPCTLVAQKIDLPEYQGEPDEISIQKCQEAVRQV 

QGPVLVEDTCLCFNALGGLPGPYIKWFLEKLKPE 

GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 

RGRTSGRIVAPRGCQDFGWDPCFQPDGYEQTYA 

EMPKAEKN A VSHRFRALLELQEYFGSLAA 


3020 


A 


1202 


180 


VSCLPTSCKMTTLNNQDQPVPFNSSHPDEYKIAA 
LVFYSCIFTLGLFVNITALWWSCTTKKRTTVTIYM 
MNVALVDL1PIMTLPFRMFYYAKDEWPFGEYFC 
QILGALTVFYPSIALWLLAFISADRYMATVQPKY 
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SEQID 
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last ammo 
acid residue of 
peptide 
sequence 


Amino acid sequence (A«Alanioe OCysteinc, KKAspartic Acid, 
E*=ClutaiuIc Acid, ^Phenylalanine, OCIycine, H=Histidine, 
I»Isoleucine, K=Ly3ine, L=Leucine, M=Methlonine, 
N=Asparagine, P=Prollne, Q=Glutaraine f R=Arginine, S=Serine, 
T=Threonlne, V«Vallne, W-Tryptophan, Y«Tyroslne, 
X=Un known, *=Stop cod on, /^possible nucleotide deletion, 
\»p05Sible nucleotide insertion 






* 




AKEUCNTCKAVLACVGVWIMTLTTTTPLLLLYK 

DPDKDSTPATCLKISDUYLKAVNVLNLTRLTFFF 

LIPLFIMIG C YLVI1HNLLHGRTSKLKPK VXEKSIR1 

IITLLVQVLVCFMPFHJCFAFLMLGTGENSYNPW 

GA1-11FLMNLSTCLDVILYYIVSKQFQARVISVM 

LYKNYLRSMRRKSFRSGSLRSLSNINSEML 


3021 


A 


27 


1897 


EEFCTWUVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVPTTAGASPGPPRNKKNREL 
RPQRPKNAYILKKSRISKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAEAEEEETSDCAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFOGRRGHVAALD 
WVTKKLMCEINVMEAVRDIRFLHSEALLAVAQN 
. 'RWLfflYDNQGIELHCIRRCDRVTRLEFLPFHFLLA 
TASETGFLTYLDVSVGiOVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
MWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDKFVR 


3022 


A 


1


2249 


MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMP APVGRRSPPSPRS SM AA VA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

VRTSKG>HTnXJKTHLSEIKMCWVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK 

CEKAFTCKNTLVQHQQIHTGQKMFECSECEESFS 

KKCHLILHKJIHTGERPYECSDREKAF1HKSEFIHH 

QRRHTGGVKHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN 

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSEUQHRRI 

HSGERPYECSECGKSFSRKSNLrRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

IIAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKB 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIP WDDKDFRMFFL WTALFW Q 

GVMFYLLLKRSGREITWKDFVNNYLSKGVVDRL 
EVVNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT 

FERNLETLQQELGIEGENRVPWYIAESDGSFLLS 

MLPTVLIIAFLLYTTRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIN1EFV 

NFUCNPKQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QENTLNQLLVEMDGFNTTThTVVILAGTNRPDILD 

PALLRPGRFDRQIFIGPPDIKGRASIFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKJCTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSn 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 

N 


3024 


A 


274 


1455 


LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAnSRVQCRIVALDLRSHGETKVKNPED 

LSAETMAKJ5VGNVVEAMYGDLPPPIMLIGHSMG 

GAIAVHTASSNLVPSLLGLCMIDVVEGTAMDAL 

NSMQNFLRGRPKTFKSLENAIEWSVKSGQIRNLE 

SARVSMVGQVKQCEGITSPEGSKSIVEGIIEEEEE 

DEEGSESISKRKKEDDMETKJCDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 

AEAVATFLIRHRFAEPIGGFQCVFPGC 


3025 


A 


621 


306 


YHGGQRGRAGGSFRSVQGWGGQLRNPFRTSKSL i 
S WKGLS SLLFPLYNLQMGRPRDRKELGRGHSPP 
HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 


3026 


A 


1533 


454 


AKVPQSTREEKRENGLEARSPAINLMGFNVEEM 

YEAHAWIQRjDLSLQNHHIIENNHILYLGRKEHDIL 

SQLQKTSSVSITEIISPGRTELEIEGARADLIEVVM 

NIEDMLOCV QEEMARKKERGL WRSLGQWTIQQ 

QKTQDEMKEN11FLKCPVPPTQELLDQKKQFEKC 

GLQVLKVEKIDNEVLMAAFQRKKKMMEEKLHR 

QPVSHRLFQQVPYQFCNWCRVGFQRMYSTPCD 

PKYGAGIYFTKNLKNLAEKAKKJSAADKLIYVFE 

AEVLTGFFCQQHPLNIVPPPLSPGAIDGHDSWD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 


3027 


A 


179 


703 


PFHLCiASSN 1FKJLQVQTQESKAQKEVKMGFIFSK 

SMNESMKNQKEFMLMNARLQLERQL1MQSEMR 

ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 

KJKKKPAFLVPIVPLSFILTYQYDLGYGTLLERMK 

GEAEDBLETEKSKLQLPRGMITFESEEKARKEQSR 

FF1DK 


3028 


A 


876 


1226 


A VGKEPESS STWVRDREGHIRSRRSMKML WKLT 
DNKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 


3029 


A 


3 


1731 


FREGRFGSSCAVAAPLAGFQGLEECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQKLPELRGVGDPAMISSNTSYL 

SSRGRMKWFWDSAEEGYRTYHMDEYDEDKNP 

SGIINLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NWVLNGGASI^SALATVLCEAGEAFLIPTPYYG 

AITQHVCLYG>nRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLBLISPQNPLGDVY 

SPEELQEYLWAKIUIIU.HVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AQLLRDRDWINQVYLPENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML 

LWRJRFLDNKVLLSFGKAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCELLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEEFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILNfDDSLMQISLQLLCVYTANFPNGCSSL 

CWSSCGQHPVQATHRGAVSNSLMLCILKLASQM 

PLENTTVQQMVFMLLSNLALSHDCKGVIQKSNF 

LQNFLSLA LPKG GNKHL SNLTIL WLKLLLN I SS GE 

DGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLI 

FHNVCFSPANKPKILANEKVITVLAACLESENQN 

AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 

EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 

NSS 


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 

LSRPKKKKPRTKNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPE APVQRRQKKTRLPLELETS ST 

QKKSSSSSLLRNENGIDAEPAEEAVIQKPRRKTK 

KTQPAELQYANELGVEDEDHTDEQTTVEQQSVF 

TAPTGISQPVGKVFVEKSRRFQAADRSELIKTl'EN 

IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 

rXAGCAVWNIVVIYVLAGDQLSNLSNIXQQYKT 

LAYPFQSLLYLLLALSTISAFDRIDFAKISVAIRNF 

XJUJDPTALASrXYFTALILSLSQQMTSDRIHLYTP 

SSVNGSLWEAGIEEQILQPWIVVKLWALLVGLS 

WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA 

SS 


3033 


A 


3 


1436 


TATSGGrWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSR1AKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARNIRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNFTRPNEKGEYEVAEGIGSTVFRAILDYYKTGn 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

HELSNDGARRQFEFYLEEMILPLMVASAQSGERE

CHIWLTDDDVVDWDEEYPPQMGEEYSQIIYSTK 

LYRFFKYIENRDVAKSVLKERGLKKIRLGIEGYP 

TYKEKVKKRPGGRPEVFYNYVQRPFIRMSWEKE 

EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 

VMHFITQVDELDILPIHPPSGNSDLDPDAQNPML 


3034 


A 


3 


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF

TCSDEFSSLRLHHNRAITHLMRSAKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAILGKG 

GYGRVYKVRNKLDGQYYAIKKJXIKGATKTVCM 

KVLREVKVLAGLQHPNIVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQKNKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDW1VERNKRGREYVDESACPY 

VMANVATKJFQELVEGVFYMNMGIVHRDLKPR 

MFLHGPDQQVKIGDFGLACTDILQKNTDWTNR 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

MYSLGWLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTRRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIEEQEKEIAELKKQLNLL

SQDKGVRDDGKDGGVG 


3035 


A 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 

PPPSGLKQSSHLSLSSSWDFRHAPTHPETYTCPK 

M1EMEQAEAQLAELDLLASMFPGENELIVNDQL 

AVAELKDCIEKKTMEGRSSKVYFT1NMNLDVSD

EKMAMFSLACILPFKYPAVLPEITVRSVLLSRSQQ 

TQLNTDLTAFLQKHCHGDVCILNATEWVREHAS 

GYVSRDTSSSPTTGSTVQSVDLl^"rRLWIYSHHIY 

NKCKRKNILEWAKELSLSGFSMPGKPGVVCVEG 

PQSACEEFWARLRKLNWKRILIRHREDIPFDGTN 

DETERQRKFSIFEEKVFSVNGARGNHMDFGQLY 

QFLNTKGCGDVFQMFLWV 


3036 


A 


1 


2288 


FRFAERRAAAAESDVSAKMAGRSMQAARCPTD 

ELSLTNCAVVNEKDFQSGQHVTVRTSPNHRYTFT 

LKTHPSWPGS1AFSLPQRKWAGLSIGQEIEVSLY 

TFDKAKQCIGTMTEBIDFLQICKSIDSNPYDTDKM 

AAEFIQQFNNQAFSVGQQLVFSFNEKLFGLLVKD 

IEAMDPSILNGEPATGKRQKIEVGLWGNSQVAF 

EKAENSSLl^IGKAKTXENRQSIINPDWNFKKMG 

IGGLDKEFSDIFRRAFASRVFPPEIVEQMGCKHVK 

GBLLYGPPGCGKTLLARQIGKMLNAREPKWNG 

PEILNKYVGESEANIRKLFADAEEEQRRLGANSG 

LHIIIFDEIDAICKQRGSMAGSTGVHDTVVNQLLS 

KIDGVEQLNNILVIGMTNRPDLIDEALLRPGRLEV 

KMEIGLPDEKGRLQILHIHTARMRGHQLLSADV 

DIKE1AVETKNFSGAELEGLVRAAQSTAMNRHI 

KASTKVEVDMEKAESLQVTRGDFLASLENDIKP 

AFGTNQEDYASYIMNGIIKWGDPVTRVLDDGEL 

LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 

EESNFPFIKICSPDKMIGFSETAKCQAMKKIFDDA 

YKSQLSCVWDDIERLLDYVPIGPRFSNLVLQAJL

LVLLKXAPPQGIUQ^LUGTTSRKDVLQEMEMLNA 
FSTTIHVPNIATGEQLLEALELLGKFTCDKERTTIA 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037 


A 


1 


1347 


MLDTGSEHLNRILKALPALQSAGSEGQNGSAESL

GEGGTRDSDRARRKLRGGNKEIPTTYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFFT

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

TQVKVWFQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSWRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 
RLETAPSLLLSRMACVISGWALSRGARTWTWAT 
PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRY1
LPNHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 
LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLHL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKK1SQASSCLQKLLYFNLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFQ 

PEDTCARLRCSLHASLLVVTLNPDQCHPSRKRRA 

AIPWKLSCKNLCHRHQLFINFRDLGWHKWIIAP 

KGFMANYCHGECPFSLTTSLNSSNYAFMQALMH 

AVDPEIPQAVCIPTKLSPISN1LYQDNNDNVILRHY 

EDMVVDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRJLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLWSYFPDKVALLQRKVEENRNSLFF 

FIXr^RLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNFICVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHIHSR 

KDT 


3041 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CIXNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY

DPSSKJSDNPKWSNTV^VQFVRMMKRFff^ 

YHQAHKATGGPLKNMVLFTRQRLS1QPLTQEEF 
DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
I^CISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLK^TLSSHWLMKSEPESRLEKGVDVKFSEEDLKA 

QPKQTTCWDGVRNYQAIOaFUlAMKLGEEAFFY 

HSNCKEPGIAGLMKTVKEAYPDHTQFEKNNPHY 

DPSSKTONPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASV1VGHPLDTVKTRLQAGVGYGNTLSCIRVVY

RRESMFGFFKGMSFPLASIAVYNSVVFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GIXJGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 

AEQPAYQGPVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEWITPEACTGPSPCAV 

WLAGGMAGAISWGTATPMDWKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLrX^EPRIWKKRGIFPKVAT^^^^RAWLFQHL 

SHPYPSEEQKKQLAQDTGLTBLQVNNWFINARRR 

IVQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH 

VAFRAPASVGMSLNSEGEWHYL 


3045 


A 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIQ 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETU^RATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGS1AGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFNRLKJC 

KMQPPAAAVTLHLGAHGF 


3046 


A 


1185 


1584 


MYAYMYICTHICICAYRGIHIDVYLYMCIYIHIWI 
HTYLCVHIYVYVYICTfflCMCIHTYVYVYTYM^ 
VYTYICLCVYICIXiVHIYLCVYIHMYMCTmCM 
IHTYVHMCICSrYIHMYTCVYVYT^ 


3047 


A 


811 


132 


SLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 

OTDAHLDINFKEGLKKERSYTGQFEANVRDEER 

QCGCGWPDSLLMKVLSQRLDQQDCIQKGWVL 

HGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 

QNPKDAEEQVKIJCMDIJFYRNSADLEQLYGSAIT 

LNGDQDPYTVFEYIESGnNPLPKKIP 


3048 


A 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEITCRLHQ 
YIX3SrNTS^QWARQTLJTFNGTRALIG)ERFQLEEFS 
PRRVRTRLSDARLEDEGGYFCQLYTEDTHHQIAT
LTVLVAPENPWEVREQAVEGGEVELSCLVPRSR 




WO 01/57190 PCT/US01/04098 




PAATLRWYRDRKELKGVSSSQENGKVWSVAST 

VRFRVDRKDDGGmCEAQNQALPSGHSKQTQYV 

LDVQYSPTARWASQAVVREGDTLVLTCAVTGN 

PRPNQniWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLVVYGESRLRPT 

EGGGGAPDPGAWEAQTSVPYAJVGGILALLVFL 

IICVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 

AAKQRLTPKLFHEVVQAFRAAVAl'I'RGDQESAE 

ANKFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDIKAYLGSAIQL

VSCLSETTVLAAVLRfflSVLVPCFLTFPKQCRML 

LKRMVVVWSTGEESLRVLAFLVLSRVCRHKKDT 

FLGPVLKQMYITYVRNCKrTSPGALPFISFMQWT 

LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQPLVYPLAQVEGCIKIJPTARFYPLRMHCIRALT 

LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK 

PIOTSVILKl^m^lLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPWLQLKSFLRECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDRE1QL 

EISGKERLEDLNFPE1KRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQLSEDD 


3050 


A 


870 


182 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGWLVHFSSEEVDMASDSPARS

LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRN1ATYYGAFIKKNPPGMDDQLWLVMEFCG 

AGSVTDHKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRJDDCGQNVLLTENAEVKLVDFGVSAQ 

LDRTVGRRNTFIGTPYWMAPEYIACDENPDATY 

DFKSDLWSLGITAEEMAEGAPPLCDMHPMRALF 

LIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRJQIJCDHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

EEHKRQLLAERQICRIEEQKEQRRRLEEQQRREKE

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPL YHYKEGMSPSEKPA WAKEVEERSRLNRQ S

SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP 

TMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TRIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 

RPASYKKAIDEDLTALAKELRELRIEETNRPMKK 

VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 

PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS

GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEARKISVVKVNPTNI 

RPHSDTPEIRKYKJCRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKKNKLR\ryYl^WLRNRILHNDPEV 

EKKQGWITVGDLEGCIHYKVVKYERIKFLVIALK 

NAVEIYAWAPKPYHKFMAFKSFADLQHKPLLVD

LTVEEGQRLKV1FGSHTGFHV1DVDSGNSYDIYIP

SfflQGNTTPHAIVILPKTDGMEMLVCYEDEGVYV 

NTYGR1TKDVVLQWGEMPTSVAYIHSNQIMGW 

GEKAIEIRSVETGHLEXjVFMHXRAQRLKFLCERN 

DKVFFASVRSGGSSQVFFMTLNRNSMMNW


3052 


A 


1 


615 


MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE

KLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 

GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 

PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 

RSQ1AHALKLSEVQVKIWFQNRRAKWKRHCAGN 

VSSRSGEPVRNPKIWPIPVHVNRFAVRSQHQQM 

EQGARP 


3053 


A 


203 


2167 


FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTWAAVQAIERK 

VEIHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENIXRNR 

NFWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEDPTDPSEEPGISTS 

DBLSWIKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KISLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH 

QMIHTGERPYPCTDCSKSFMRKEHLLNHRRLHT 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 

GSGGGVL 


3054 


A 


3 


2212 


SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF

APWARASFLCHAFQRPLTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGVVDFHIALRHNKCCESGDAF

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL 

TQHQKVHmGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK

PYECSDCGKITSQRS^IHHKRVHTGRSAHECSE 

CGKSFNCNSSLIKHWRVHTGERPYKCNECGKFFS

HIASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK 

HQRVHTGERPYECNECGKLFSQSSSLNSHRRLHT 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKJHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 


3055 


A 


268 


2954 


ARRSSSSQGSAAPTPCQWEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AWGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQF1TRENCLILA 

VTPAInTTOLANSDALKLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRGYVGWNRSQ 

KPIDGKKDKAAMLAERKFFLSHPAYRHIADRM 

GTPHLQKVLNQQLTNHTRDTLPNFRNKLQGQLLS 

BEHEVEAYKNFKPEDPTRKTKALLQMVQQFAVD 

FEKRffiGSGDQVDTLELSGGAKINRIFHERFPFEIV 

KMEFNEKELRREISYAIKNIHGIRTGLFTPDMAFE 

AIVKXQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEETERIVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFhTIEQRNVYKDYRFLELACDSQEDVDSW 

KASIXRAGVYPDKSVGNNKAENDENGQAENFS

MDPQLERQVKl'lKNLVDSYMSIINKCIRDLIPKTr 

MHLMINNVTGDFINSELLAQLYSSEDQNTLMEES 

AEQAQRRDEMLRMYQALKEALGIIGDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPTTQRRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPQVPSRPTEtAPPSVPSRRPPPSPTRPTI 

IRPLESSLLD 


3056 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057 


A 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
W^VACHTTMYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PLMKVLQNAPDEELWASSMLCNLLLEFSPSKEPI 

LESGAVE1XCGLTQSENPALRVNGIWALMNMAF 

QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT 

LGLLRNIXSTRPHIDKIMSTHGKQIMQAVTLILEG 

EHNIEVKEQTLCILANIADGTTAKDLIMTNDDILQ

KIKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 
ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 


3059 


A 


679 


167 


SSWPSLSSQMHFPSFHLHVAAHYGRDSFVRLLLE

FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN 

ANIDIQNGFLLRYAVIKSNHSYCRMFLQRGAJDTN 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 

TNTRNYEGQTPLAVSISISGSSRPCLDFLQEVTSM 


3060 


A 


30 


234 


PPLQLDMDPNCYCADGDSCTCAGSCKCKECKCT

SCKKSCCSCCPAGCAKCAQGCICKGATDKCSCC 

A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSVNKRPKKETKKKR 


3062 


A 


1589 


276 


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 

VIDSSMKNFKAFr^WLYVAMLRMTEDHVLPELN 

KMTQKDni^AEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

SSHLKESPLUTYYPRKSLHFVKRRMENIIDQCLQ 

KPADVrGKSMNQAICIPLYRDTRSEDSTRRLFKFP 

FLWN>«TSNLHYLLFTILEDSLYKMC1LRRHTDIS 

QSVSNGLIAIKFGSFTYATTEKVRRSIYSCLDAQF 

YDDETVTVA^KDTVGREGRDRLLVQLPLSLVYN 

SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 

WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR

VFEMDEDDEWELDESSDEEEEASNKPVKIKEEVL

SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 

SNIPFFIFGPLMMLLMHPYAQKRSRYIYVVWVLF 

MIIGLFSMYFHMTLSFLGQLLDEIAILWLLGSGYS 

IWMPRCYFPSFLGGNRSQFIRLVFITTVVSTLLSFL 

RPTVNAYALNS1ALHILYIVCQEYRKTSNKELRH 

LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 

HSIWHVLISITFPYGMVTMALVDANYEMPGETL 

KVRYWPRDSWPVGLPYVEERGDDKDC


3064 


A 


1523 


925 


AATMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNVHSFKPEELMVKTKJXjYVEVSGKKEEKQ 

QEGGIVSKNFTKiaQLPAEVDPVTVFASLSPEGLL 

IIEAPQVPPYSTFGESSFNNELPQDSQEVTCT 


3065 


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRJBNLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDIPDLLGGNGCLGSVVFSESFLTS 

QILVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLQSWPEEGNVHFFSSGLLFSHCRHGSniSKD 

HMNSISFYDGDSTSTVAALLIDFKSSLLPHLPVHF

HGSSNFLMIALFPKSKIYQAFYSEVFSLWKQQDN 

SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 

GEKRSSLKIXSAKLPELDWFLQHFAISSISQEPVM 

RTHlJVlXQQAEINTTHRIESDKVnSTVTGLPGCH

ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDVVQALQTHPDSNVKASFTIGAITA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNVVFT 

SHTTCQRHPIXVQLQSLIRAANPAAAFILAENGIV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSIKPSPFSGNIYHILGKVKFSDSERTMEVCYNT 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 

VFIGCSLKEDSIKDWLRQSAKQKPQRKALKTRG 

MLTQQE1RSIHVKRHLEPLPAGYFYNGTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREIEKYNQELE 

QQEYHDLFELKP 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR 

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLMATGFIWSRYSLVnPKNWSLFAVNFFVGAA 

GASQLFRIWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 

KEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE

VNCETDFVSRNLKFQLLVQQVALGTMMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML 

SQPYLLDPSITLGQYVQPQGVSWDFVRFECGEG 

EEAAETE 


3068 


A 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSGVPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSWNA

PTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGTLIVKMIGKVLGRKHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYLLLEKHNEARAVKAFQTFLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVTMACYQLCGLNAIWFYTNSIFGKAGIPPAKIP 

YVTLSTGGIETLAAVFSGLVIEHLGRRPLLIGGFG 

LMGLFFGTLTITLTLQDHAPWVPYLSIVGILA1IAS 

FCSGPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLS 

NFAVGLLFPFIQKSLDTYCFLVFATICITGAIYLYF

VLPETKNRTYAEISQAFSKRNKAYPPEEK1DSAV 

TDGKINGRP 


3069 


A 


861 


300 


AAGAWSAMPKAKGKTRRQKFGYSVNRKRLNR

NARRKAAPRIECSHIRHAWDHAKSYRQNLAEMG 

LAVDPNRAW1JRJKRKVKAMEVDIEERPKELVRK 

PYVLNDLEAEASLPEKKGNTLSRDLIDYVRYMV

ENHGEDYKAMARDEKNYYQDTPKQIRSKINVY 

KRFYPAEWQDFLDSLQKRKMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAGNPVGDKLNVTrVGPRGPLLVQDWFTD

EMAHFDRERIPERVVHAKGAGAFGYFEVTHDIT

KYSKAKVFEfflGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKJYTEDGNWDLVGNNTPIFFIRDPILF 

PSFIHSQKRNPQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANG 

EAVYCKFHYKTDQGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGffiASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MQDNQGGAPNYYPNSFGAPEQQPSALEHS1QYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRKR 

LCEh^GHLKDAQIFIQKKAVlO^TEVHPDYGSH 

IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA 

NL 



3071 



1187 



SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSWQDSRLDTIFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 

AVRPVTEKRIQKYSEGEIRFNLMAIVSDRKMIYEQ 

KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 

NQNILffiEEVQKLKRYKIENIRRKHNYLPFIMELL 

KTLAEHQQL1PLVEKAKEKQNAKKAQETK 



3072 



103 



2775 



3073 



67 



2415 



RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLfflSG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDKKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKRLEEQRQKJEERRRAAVEEKRRQRLEED 

KERHEAVVRRTMERSQKPKQKHNRWSWGGSLH 

GSPSIHSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSVVNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPIIMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRHHGTASYKKERE

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

EEIMKRTRRTEATDKKTSDQRNGDIAKGALTGG 

TEVSAIJPCTTNAPGNGKPVGSPHVVTSHQSKVT 

VESTPDLEKQPNENGVSVQNENFEEIINLPIGSKP 

SRLDVTNSESPEIPLNPLLAFDDEGTLGPLPQVDG 

VQTQQTAEVI 

PPRVCRDHVCLICWDPIAGTGGSRSTMPALPLDQ

LQrniKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDNIPALVEEVLERATFVANDLDWLLALPHD

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS 

APEVVDMQKRLHRSVFLTFLRMSTHKESKDHFIS 

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNIFr(^PSYYSDlJ>ETXPTELQVFSNILQHC 

GLQGDGANTTPQKT .EFRGRLTPSDMPLLELKDI V 

LYLCDTCTTLWAFLDII^LACQTFQKHDFCYRLA 

SFYEAAIPEMESADCKRRLEDSKLLGDLWQRLSH 

SRKKLMEIFHIILNQICLLPILESSCDNIQGFEEEFL

QIFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV

LDETRTAYILQAVESAWEGVDRRKATDAKDPSV 

IEEPNGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVIhn^EERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSWVEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNQVGANDADSDDELISRRPFTIPQVLRTKVPRE 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK 

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMIPS 


3074 


A 


3 


251 


GEARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMR>IEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRJLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVEDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIYTGGATGIGKAIVKELLELGSNWI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

l^EE\n s ^hn^VKSTLDTFGKINFLVNNGGGQFLSPA 

EHISSKGWHAVLETNLTGTFYMCKAVYSSWMK 

KHGGSIVNIIVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKJPAKRIGVPEEVSSWCFLLSPAA

SFTTGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSVVKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDrrVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMBVQGPTRES

GQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREy 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CVNALAARDPIWAARFRSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAIPAMYLDCISDLRQKEITDGIHSSSDINILyN 

DAVESCIQDPSAEGLSEEVPWFEELPWFEDVA 

VTrHnREEWGMLDKRQKELYRDVMRMNYELLAS 

LGPAAAKPDLISKLERKAAPWIKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA

RASCCSSS1CEEGDGPRRIKRTYRPRSIQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKILQLLQSTGTVILGKYRNRTACTQFIKYISETL

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFEnVSALD 

E1X)IPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHLAWDACGSIDLVKK 

CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEITR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGffiYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

PTGYSEEALLEEWLGLKTIAQHLPFSMLCKNALA 

QHCRFPLLSKLMAWVCVPISTSCCERGFKAMN 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 

PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA

RLRKEEMGALYVEEPRTQKPPDLPSREAAEVLKD 

CIMEPPERLLYPHTSQEAPGMS 


3079 


A 


343 


1513 


FSPLEPRLCSLGGWGALQAGEPCQPSRAGCGRE 

GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 

KDVBCLLLLGAGESGKSTTVKQMKHHEDGFSGED 

VKQYKPVVYSNTIQSLAAJVRAMDTLGIEYGDK 

ERKADAKMVCDVVSRMEDTEPFSAELLSAMMR 

LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 

AADYQPTEQDILRTRVKTTGIVETHFTFKNLHFR 

LFDVGGQRSERKKWIHCFEDVTAIIFCVALSGYD 

QVIJIEDETTNRMHESIJKLro 

LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 
YIQAQYESKNKSAHKEIYSHVTCATDTNNIQFVF 
DAVTDVIIAKNLRGCGLY 


3080 


A 


41 


997 


EARTARELTDGVTDGLTMADQPKPISPLKKLLA

OOr uvj V O Li Vr^GHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPnGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

LDCAKKLYQEFGIRGIYKGTVLTLMRDVPASGM 

YFMTYEWLKN1FTPEGKRVSELSAPRTLVAGGIA 

GBFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 

RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 

EVAMKFLNWATPNL 


3081 


A 


3 


1996 


IMADMBDLFGSDADSEAERKDSDSGSDSDSDQE

NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE

DEEMLDEEGRTRLKLKVENT1RWRIRRDEEGNEI 

KESNARIVKWSDGSMSLHLGNEVFDYYKAPLQG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

rDUCMTLSLADRCSKTQKIRILPMAGRDPECQRTE 

MDCKEEERLRASIRRESQQRRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAAIKNRYKGGIREERA

RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRFNL 

FN SRG LS CTQEPTALNEELTDQ AGTN 


3082 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 1 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENQSLRGWQELQQAISKLEA

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKXAKKPALVAXSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 
FNKI 


3083 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVELRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQ YAEKKAKKPAL VAKS SILLDVKP WDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QXQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 
PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 
DLLDKEFLPILQEEPLPPLALVPFTEEEQ1WFSMS 1 
VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 
GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS
WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 
HEF1RSESENWRIFREEQNGEDEDGGWRLAGSRR 
DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR
DRDDERGYRRVRSGSGSDDDDRDSLPEWCLEDA 
EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 
VDEG EECSDSEG SHNEEAKEPDKTNKKEGEKTD 
RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE
RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 
MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 
VETPWGAPGMGSVSTEPpDEEGLKHLEQQAEK 
MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQN1IPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRKQKGILRRQQEEERKJ^REEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTOSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSI WSNADTKN SNMGF WDD AVKE VGPRN 

STNKNK1WASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQLXjFTQWCEQMLHALNTANNLDVP 

TFVSFLKE VESP YE VHD YIRA YL GDTSEAKEF AK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 
LNMGEIETLDDY 


3085


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK

HEFDTSESENWRIFREEQNGEDEDGGWRLAGSRR

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA

EEEMGTFDSSGAFLSLKKVQKEPEPEEQEMDFRP

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTO

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP

VETPWGAPGMGSVSTCPDDEEGLKHLEQQAEK

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG

YFIMSIXVKRACDESFQPLGDINIKMWGRVPFSP

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTIJKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTV WEGGS VWDLPLD rilPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRAIWNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSmQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3086 


A 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 

LEAQIPLCANLVPVPITNATLDRITGKWFY1ASAF 

RNEEYNKSVQEIQATFFYFTPNKTEDTIFLREYQT 

RQDQCIYNTTYLNVQRENGTISRYVGGQEHFAH 

LLIIJUDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRIPKSDWYTDWKKDKCE 

PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CIPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKJINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCmQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKR1TA 

DQALKHPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTVYH 

NATDGIKGSTESChriTTEDEDLKVRKQEIIKITEQ 

LDSAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTnLNPHVHVIGED 

AACIAYIRJLTQYIDGQGRPRTSQSEETRVWHRRD

GK WLNVHYHCS G APAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNnTADHVSPLHEACLGGHLSC 

VKILUCHGAQVNGVTADWHTPLFNACVSGSWD 

CVNLLLQHGASVQPESDLASPIHEAARRGHVEC 

VNSLIAYGGNEDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

LEREGPPSLMQLCRLIURKCFGIQQHHKITKLVLP 

EDLKQFLLHL 


3089 


A 


73 


432 


DMAGLMTTVTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFWSKJUPEmWSYQLSSRSTClJCAGVlPTnCK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGG GGGALP AG VETMVLTLGESWP VL V GR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKJDLKVCV HEDGES WRKRRWIEV 

YSLLRRAFLVEHNLVLAERKSPEISERIVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKNLVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTXQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 

GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS 

EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKThTTDQENRJLESVPQALTGLPKECXPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCC SRSNNKJQNAP SRKS VLTOPAKJLKKLQ 

QSGEAFVQDDSCVMVAQLPKCRECRLDSLRKD 

KEQQKDSPWCRFFHFRJU-QFNKHGVLRVEGFLT 

PhOCYDNEAIGLWLPLTKNVVGIDLDTAKYILANI I 

GDHFCQMVISEKEAMSTIEPHRQVAWKRAVKG 

VREMCD VCD 1 " 1 1FNLH WVCPRCGFG VCVDC YR 

MKRKNCQQGAAYKTTSW1JCCVKSQIHEPENLM 

PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 

WLADLTSGNVNKENKEKQPTMPILJCNEIKCLPPL 

PPI^KSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

P^LTKPSILGTOTPHYWLCDNRLLCLQDPNNK 

SNWN VFREC WKQG QP VMVS G VHHKLNSEL WK 

PESFRKEFGEQEVDLVNCRTNEIITGATVGDFWD 

GFEDWhHUJCNEKEPMVIJCLKDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

DAANVMVYVGIPKGQCEQEEE VLKTTQDGD SDE 

LTIKRFIEGKEKPGALWHIYAAKDTEKIREFLKK 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 

EYGVQGWATVQFLGDVVFIPAGAPHQVHNLYSC 

IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHTNHE 

DKLQVKNVIYHAVKDAVAMLKASESSFGKP 


3091


A 


97 


1838 


KRG ARRG G WKRKMP STD LLMLKAFEP YLEILE V I 

YSTKAKhTVVNGHCTKYEPWQLIAWSVVWTLLI 

VWGYEFVFQPESLWSRFKKKCFKLTRKMP1IGRK 

IQDKLNKTKDDISKNMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGDCTPEIVAPQSAHAAFNKAASYFGMK1 

VR VPLTKMME V D VRA MRRA ISRNTAML VC STP 

QFPHGVIDPVPEVAKLAVKYKIPLHVDACLGGFL 
rVFMEKAGYPLEHPFDFRVKGVTSISADTHKYGY 
APKGS SLVLYSDKKYRNYQFFVDTD WQGGIYAS 
PTIAGSRPGGISAACWAALMHFGENGYVEATKQI 
1KTARFLKSELENIKG1FVFGNPQLSVTALGSRDFD
IYRLSNLMTAKG WNLNQLQFPP SIHFCrTLLHAR 
KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 
MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSQ 
MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRJ 

VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 

rcLVSF>nLVEDKMKIJTVEVEIIDINDNTPQFQL 

EELEFKMNHlTi'PGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLILTASDGGEPVRSGTLRIYIQVVDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVM^NAPEVTITSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKXVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETraSL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEIL YPALPTDG STG VEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

VVAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 

DO VRAFLQTYSHEVSLTADSRKSHLIFPQPNYAD 

TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 

ECISYLEKNNS 


3093 


A 


1 


3868 


PPDNQKLGLLEALLKIGDWQHAQNIMDQMPPYY 

AASHKLIAL A ICKLMTIEPLYRS VTS WA VDHAG 

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKWRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLLSITDQVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRJLYGQWKNETYNSHPLLVKVKAQTID 

RAKYIMKRLTKENVKPSGRQIGKLSHSNPTJDLFD 

YVCFEILSQIQKYDNLITPWDSLKYLTSLNYDVL 

ACILSNCnEALANPEKERMKHDDTTTSSWLQSLA 

SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 

LILKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 

KAEGGYFGQIRlsrrKKSSQRLKDALLDHDLALPL 

CLLMAC^RNGVIFQEGGEKHLKLVGKLYDQCH 

DTLVQFGGFLA SNLSTEDYIKRVPSIDVLCNEFHT 

PHDAAFFLSRPMYAHHISSKYDELKKSEKGSKQ 

QHKVHKYITSCEMVMAPVHEAVVSLHVSKVWD 

DISPQFYATFWSLTMYD1JVVPHTSYEREVNKLK 

VQMKAIDDNQEMPPNKKKKEKERCTALQDKLL 

EEEKKQMEHVQRVLQRLKLEKDNWLLAKSTKN 

ETITKFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTLLCYDRVFSDIIYTVASCTENEASRYGR 

FLCCMLETVTRWHSDRATYEKECGNYPGFLTIL 

RATGFDGGhaCADQIJDYENFRHVVHKWHYKLT 

KASVHCLETGEYTHIRNILrV^TKILPWYPKVLNL 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYMIPENEFHHKDPPPRNAVASVQNGPGG 

GPSSS SIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTTKGNSSNGNSGSNSNKAVKENDKEKG 

KEKEKEKKEKTPATTPEARVLGKDGKEKPKEER 

PNKDEKARETKERTPKSDKBKEKFKKEEKAKDE 

KFKTTVPN AE SKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLIELKESSAKLY1NHTPP 

PLSKSKEREMDKKDLDKSRERSREREKKDEKDR 

KERKRDHShmDREWPDLTKRRKJEENGTMGVSK 

HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 

KSEKMDKISSGGKKESRHDKEKIEKKEKRDSSGG 

KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 
PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 
ENFFQELEKARKG SGMMSKSDNFGEKMKEFMQ 

KYDKNSDGKJEMAELAQILPTEENFLLCFRQHVG 

SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 

KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 

SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY

DKDRSGYIDEHELDALLKDLYEKNKKEMNIQQL 

T>HiiaCSVMSIAEAGKLyRja>LEIVU:SEPPM 


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHV K WPFPA VPP 

LTWTL AS S VVMGL VGTYS CF WTK YMNHLTVHN 

REVLYELIEKRGP ATPLITVSNHQ SCMDDPHL WG 

ILKLRHIWNLKLMRWTPAAADICFTKELHSHFFS 

LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 

AGKRREKGDGVYQKGMDFILEKLNHGDWVH1F 

PEGKVNMSSEFLRFKWGIGRJL1AECHLNPIILPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM

EAQPE WLRAEVKRLSHEL A ETTREKIQAAE YGL 

AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

GQAHTNHKKVAADGESREESLIQESASKEQYYV 

RKVLELQTELKQLRNVLTNTQSENERLASVAQE 1 

LKEINQNVEIQRGRLRDD1KEYKFREARLLQDYS 

ELEEENISLQKQVSVLRQNQVEFEGLKHEIKRLE 

EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 

EQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKF 

SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 

TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 

QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 

VTRLTENLSALRRLQASKERQTALDNEKDRDSH 

EDGDYYEVDINGPEILACKYHVAVAEAGELREQ 

LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 

SLLEKASRQDRELLARLEKELKKVSDVAGETQG 

SL S V AQDELVTFSEEL ANL YHHVCMCNNETPNR 

VMLDYYREGQG GAGRTSPGGRTSPEARGRRSPI 

LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 

PRREPMNIYNLIAIIRDQIKHLQAAVDR1TELSRQ 

RIASQELGPAVDKDKEALMEEBLKLKSLLSTKRE 

QITTLRTVLKANKQTAEVALANLKSKYENEKAM 

VTETMMKLRNELKALKEDAATFSSLRAMFATRC 

DEYTTQLDEMQRQLAAAEDEKKTLNSLLRMAIQ 

QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 

TPSVSHTCACASDRAEGTGLANQVFCSEKHSIYC 

D 


3097 


A 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKTTA 
SERLRRRPRATARJLRAHAAPPEPPLAVFAPPSDR 
KF.T .1 *ALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD 
GIDAGKAVTLQQGFNQGYKKGAEV1LNYGRLRG 
TLSALLSWCHLHNhWSTLINKINNLLDAVGQCEE 
YVLKHLKSITPPSHWDLLDSIEDMDLCHVVPAE 
KKIDEAKDERLCENNAEFNKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAAT1XRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVLGLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 

FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 

ENPKXYffGTKMlFVGIKXKEERADLIAYLKKAT 

NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYD1K 

ALIGRGSFSRWRVEHRATRQPYADCMIETKYRE 

GREVCESELRVLRRVRHANIIQLVEVFETQERVY 

MVMELATGGELFDRHAKG SFTERD ATRVLQMV 

LDGVRYLHALGITHRDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWVVSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKS AQSTRS SRSTRSNKSR 
RVRERELREL 


3100 


A 


3 


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDIEEDKLG1PTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV 

ENMSSGTADALGLSRAILCNDGLVKRLEELERTA 

ELYKGMTEHTKNLLRAFYELSQTHRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTTKPMLTDLNTYL>nCAIPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV 

STGNYEYRULRCRQEARARFSQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPIEVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101 


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKlAIEAGFRHroSAHLYNNEEQVGLA 

mSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCri'WEAMEKCKDAGLAKSIGVS 

NFKRRQLEMILNKPGLKYKP VCNQ VE CHP YFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPAOTHLHKNESVLKAKAVVAFHRGNFREL 

YKILESHQFSPHNHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT 

TQVSNWFKNRRQRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS


3103 


A 


111


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKKEQAKNKEDSNIRENSSGAGKTKRAFD f 

FSAHGRRHVALRIAYMGWGYQGFASQENTNNTI

EEKLFE ALTKTRL VE SRQTSNYHRCGRTDKG V S 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPEID)ELLNIEKNPQKPQYSMAVEFPLVLY 

DCXrahTS^WIYIXJEAQEFNITHLQQLWANrlAV I 

KTHMLYSMLQGLDTVPWCGIGPKMDGMTEWG 

KVKPSVIKQTSAFVEGVKMRTYKPLMDRPKCQG 

LESR1QHFVRRGRIEHPHLFHEEETKAKRJDCNDT 

LEEDNTNLETPTKRVCVDTEDCSU J 


3104 


A 


227 


1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGHIQTALYGKMGRVRSPHPYGH 

RKnTMSDGATSTFT)LFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 

LPNffiLTSPRMK 1 YGCTWEFGAMVNYTKKTYPLT 

QLVWGFSLGGNIVCKYLGETQANQEKVLCCVS 

VCQGYSALRAQETFMQWDQCRRFYNFLMADN 

MKKIILSHRQALFGDHVKKPQSLEDTDLSRLYTA

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 

HRIYVPLMLVNAADDPLVHESLLTTPKSLSEKRE J 

NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 

VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251 


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 
ADEAARAGEGFRYIKPVPGLLLREYLYGGGRDE
EPSGAAPEGGATPTAAPETPAPPTRETCYFLNAT1 
LFLFRELRDTALTRRWVTKKIKVEFEELLQTKTA 
GRLLEGLSLRDVFLGETVPFIKHRLVRPWPSAT 
GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 
IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT
HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSITVNQ 
LKKIIKRKHTLPNYKIRFKPFFPYQTLQGFEEDEE I 
HMQQWALTEGRLKVTLLECSRLLIFGSYDREA i 
NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 
WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 
VPLRQCPG 1 


3106 


A 


972 


468 


MAAAGAGRLRRVASALLLRSPRLPARELSAPAR 

LYHKKVVDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVMKLQIQVDEKGKIVDARFKTFGCGSA 

1ASSSLATEWVKGKTVEEAXTIKNTDIAKELCLPP

VKLHCSMLAEDAIKAALADYKLKQEPKKGEAE 

KK 


3107 


A 


106 


1221 


TCQDVRS WSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGIVFLETSERMEPPHLVSCS 
VESAAKJYPEWPVVFFMKGLTDSTPMPSNSTYPA
FSFLSAIDNVFLFPLDMKRIXEDTPLFSWYNQINA 
SAERNWLHISSDASRLAIIWKYGGIYMDTDVISIR
PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME
NFVEHYNSAIWGNQGPELMTRMLRVWCKLEDF
QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW
DTEPSFNVSYALHLWNHMNQEGRAVIRGSNTLV
ENLYRKHCPRTYRDLDCGPEGSVTGELGPGNK


3108 


A 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQRNFQ 
LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE 
KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 
VDKHIRRLDTDLARFEADLKEKQIESSD YDS SSS 
KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 1 
QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 
DPNEPTYCLCHQVSYGEMIGCDNPDCSffiWFHFA 
C VGLTTKPRGK WFCPRCS QERKKK 


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRA QFGPHD WLS LP VPPGPS WLL VDTLEPET 

AYQFSVLAQNKLGTSAFSEVVTVhTTLAFPITTPEP 

LVL\nTPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 

EDGLARPVLAGIVATICFLAAADLFSTLAACFVNK I 

QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 

SIRTLRAPSESSDDQGQPAAKRMLSPTREKELSL 

YKKTKRAISSKKYSVAKAEAEAEATTPIELISRGP 

DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

SSVMSSPPLPTEGPFGHPTIPEENGENASNSTJLPLT 

QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGELSL 

EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 

VSQGQLRHTSQGMGIPVLPYPEPAEPGAHGGPST 

FGLDTRWYEPQPRPRPSPRQARRAEPSLHQWLQ II 

PSRLSPLTQSPLSSRTGSPELAARARPRPGLLQQA 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 

YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 

QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 

ATSPPERALSKL


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV

AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 

KOTFYDSSDNPYTRWLASTEGLQYSLHGLAAGA 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVLFSKAQTYELERRFRQQRYLSAPEREHLA

SLIRLTPTQVKIWFQNHRYKMKRARAEKGMEVT 

PLPSPRRVAVPVLVRDGKPCHALKAQDLAAATF 

QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 


3111 


A 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRLLH
GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD

GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE
PK


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST J 

RHTM.SNTHYSDLIVWNCCLFFRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRKFPV 

APLIPYPLITKEDINA1EMEEDKRDL1SREISKFRPT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE

RREREREREREREREKEKXRERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERRKJLERKLREKEA 

AYQERIJChTWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEE1RQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPIS S 

APSVSSASGNATPNTPGDESPCGHIPHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

DSVI^^IKFEDEDSDDVPRKRKLVPLDYGEDDKNA 

TKGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW 

SIVDSDLMERRIRPWINKKnEYIGEEEATLVDLVC 

SKVMAHSPPQS1LDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAG1RDPCSTPLAKPA AGG AENLSFGKQPG LET 

NILKMTTPNKTPPGADPKQLERTGTVREIGSQAV 

WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 

VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 

GNNFHhn^QElRQLELVEPSGWfflVPLTDNHKKPT 

RTFMIQIAVI^NHQNGRDTHMRQIKIYTPVEESSI 

GKFPRCTTIDFMMYRSIR I 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELWPGRDEGSRGALPGSSGVKF 

VWRKIVRFPVSDQVRTI^ISRLMRRLLEMMQTL

VQFnGWRSLLGRTLGTIMNTMYVMMAQILRSH 

LIKATVIPNRVKMLPYFGIIRNRMMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEG1GFGTPTQR 1 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

DNLSGKGKPLKKPSDCSYIDPMTHNLNRILEDNG 

YQPEWILKQKEISDTIEQLREAILVSRKKLGNPMT 

PTEKKQWNHVCEQFQENIRKLNKRINDFNLIVPI 

LTRQKVHFDAQKEIVRAQKIYETLDCTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY


3115 


A 


1 


2036 


FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEIIRSNFKS^HKVYQAIEEADFFAIDGE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLC1 FK YD YTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSDDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQKKFIDQVVEKIEDIXQSEENKNLDL 

EPCTGFQRKLIYQTLSWKYPKGIHVETLETEKKE i 

RYTVISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKLVIGHNMLLDVMHTVHQFYC 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

IINNTSLAELEKRLKETPFNPPKVESAEGFPSYDT 

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 

HVSARSKLIEPFFNK1-FLMRVMDIPYLNLEGPDL 

QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 

WIDDTSAFVSLSQPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYRNNSFTAPSTVGKRNLSPSQE 

EAGLEDGVSGEISDTELEQTDSCAEPLSEGRKKA 

KIOJCW^QOCELSPAGSISKKSPATLJTEVPDT^ 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RJLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRS ANDELFRAG SRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

S YIFRPS CTRLGQW A1G YV SSDG SELQTTP ANKPL S 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAN4DSTFELCKICAESNKDV 

KJEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKKHSNNWLAQHWF 

QSSIILCFSPVGRTLRVRARXFPArVNCTAlDWFH 

AWPQEALVSVSRRFIEETKGIEPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHNY'riPKSFLEQISLF 

KNLLKXKQKEVSEKKERLVNGIQKLKTTASQVG 

DLKARLASQEAELQLRNHDAEALITK1GLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPAL VAATAALNTLNRVNL SELKAFPNPPIA VT 

NAH-AAVMVLLAPRGRWKDRSWKAAKVFMGK 

VDDFLQALINYDKEHIPENCLKWNEHYLKDPEF 

NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLVVSANYD 

DBKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THCTRWPLVIDPQQC^IKWIKNKYGMDLKVTHL 

GQKGFLNABSTALAFGDVn JRNLEETIDP VLDPL 

LGRNTIKKGKYIRIGDKECEFKKNFRLILHTKLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERK1NEARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRADEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKKEIDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKIJ^QEWKKKSLIQKLIIXRAMRPDRMTY 

ALRNFVEEKLG AK YVERTRLDL VKAFEES SP ATP 

IFFILSPGVDALKDLEILGKRLGFTIDSGKFHNVSL 

GQG QETV AEV ALEKASKG GHWVILQNVHL V AK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EHIIPQGLLENSIKri>IEPPTGMLANLHAALYNFD 

Q 


3118 


A 


1 


226 


PYSLSTSCLGSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE 


3119 


A 


1254 


4133 


PLATLTMEEQGHSEMEIIPSESHPfflQLLKSNREL 

LVTHIR>RRQCLVDNLLKNDYFSAEDAEIVCACPT

QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY

VDLRPWLLEIGFSPSLLTQSKWVNTDPVSRYTQ

QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME

LVGFSNESLGSLNSLACLLDHTTGELNEQGETIFIL

GDAGVGKSMLLQRJLQSLWATGRLDAGVKFFFH

FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE

VFAFLLRFPHVALK1FDGLDELHSDLDLSRVPDS

SCPWEPAHPLVLLANLLSGKLLKGASKLLTART

GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER

ALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQH

FRAAFEGSPQLPIX:TMTLTOVFLLVTEVHI^RM

QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR

GMEKSLFVFTQEBVQASGLQERDMQLGFLRALP

ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG

TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG

SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL

RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR

VQVESFNQVQAMPTFTWMLRCIYETQSQKVGQL 

AARGICANYLKLTYCNACSADCSALSFVLHHFP 

KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 

VNQ1TDGGVKVLSEELTKYKIVTYLGLYNNQITD 

VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 

LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA 

LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL 

EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 

IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP 
EEAKVYEDEKRHCF 


3120 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTEPKRHELVKGPK 

KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE 

LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK 

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

FICPWGLEMNGRHRFCFLRCCGCVFSERALKEI 

KAEVCHTCGAAFQEDDVIVLNGTKEDVDVLKTR 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP 

GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS 

SAKRSKEESAHWVTHTSYCF 


3121 


A 


3 


1490 


HASGPTRP V S WSFHKLKTMKHLLLLLLCVFL VK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNhTVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KJDKENVVNEYSSEIJEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 
ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 
QNEANKYQISVNKYRGTAGNALMDGASQLMGE

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)










NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGWWMNWKGSWYSMKKMSMKIRP
FFPQQ


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGWWMNWKGSWYSMKKMSMKIRP

FFPQQ


3123 



A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSK1QKLESDVSAQMEYCRTPCTVS 

CNBPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCPMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTTHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA


3125 


A 



3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTV1ENGEIRFNGKGKKIRKPR 

TTYSSLQLQALNHRFQQTQYLALPERAELAASLG 

LTXJTQVKIWFQNKRSKFKKLLKQGSNPHESDPL 

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY

MPGYSHWYSSPHQDTMQRPQMM 


3126 


A 



43 


5377 


LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

CIUIGNFAEAHQVLFTFNLKSSPSSGELMFMERY

QEVIQELAQVEHKBBNQNSDAGSSTIRRTGSGRST 

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPIPM 

LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI










DHVLLNADGIROFPWLQQISKSLNYLLMSASQT 

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA 

SHTTLSQQLDQVLQSLRJEALELPEPRTPPLSSLVE 

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS 

RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK 

VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL 

AQENLSLSVPQVIVSCCCEPLALCSSRQSQQTSSL 

LTRLGTlJVQLHASHCLDDI^LSTPSSPRTTENPTL 

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA

CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV 

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS 

LAVNLCGWASLSTVLLGLHSP1ALDVLSEAFEES 

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC 

AVACDKBGWQYLFPVKDASLRSRX.ALQFVDRW 

PLESCLEBLAYCISDTAVQEGLKCELQRKLAELQ 

VYQKILGLQSPPWCDWQTU(SCCVEDPSTVMN 

MILEAQEYELCEEWGCLYPIPREHLISLHQKHLL 

HLLERRDHDKALQLLRRIPDPTMCLEVTEQSLDQ 

HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY 

VGSKILLTLPEQHRASYSHLSSNPLFMLEQLLMN 

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL 

LSRYAEKALDFPYPQRJEKRSDSVIHLQEIVHQAA 

DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT 

QPSQEFVPPATPPARHQWVPDETESICMVCCREH 

FTMFNRJWfflCRRCGRLVCSSCSTBaCMVVEGCRE 

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE 

SPPYSFVVRVPKADEVEWILDLKEEENELVRSEF 

YYEQAPSASLCIAILNLHRDSIACGHQLIEHCCRL 

SKGLTNPEVDAGLLTD1MKQLLFSAKMMFVKAG 

QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ 

ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT 

TGAWHAWGMACLKAGNLTAAREKFSRCLKPPF 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVIPEGKIMNNTYYQ 

ECLFYLHNYSTNLAIISFYVRHSCLREALLHLLNK 

ESPPEVFIEG1FQPSYKSGKLHTLENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH

LKm-QETSRSSGRKKTTr^RKKNTTAADVSRHM 

OTLQLQMEVTRFLHRCESAGTSQITTLPLPTLFG 

NKfflvlKMDVACKVMLGGKNVEDGFGIAKRVLQ 

DFQLDAAMTYCRAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTEL.LNCLEAFKR1PPQCCFCSA 

QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV 

KQEHSRATALVQQVQQAAKSSGDAVVQDICAQ 
WLLTSHPRGAHGPGSRK. 


3127 


A 


467 


1259 


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP 

PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL 

KTQLATLTSSLATVTQEKSRMEASYLADKKKMK 

QDLEDASNKAEEERARLEGELKGLQEQIAETKA 

RJLITQQHDRAQEQSDHALMLRELQKLLQEERTQ 

RQDLELRLEETREALAGRAYAAEQMEGFELQTK 

QLTREVEELKSELQAJRDEKNQPDPRLQELQEEA 

ARJLKSHFQAQLQQEMRKVIIHISFKHQPLT 


3128 


A 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCL1XGLHLFL










LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTWSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSG1JTIVVLLGIA 

FVVYKLFLSDGQYSPPPYSEYPPFSHRYQRFTNS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTORR 


3129 


A 


2340 


1192 


EI^RRPKQQSSEKSRNMIRNWLTIFILFPLKLVEK 

CESSVSLTVPPVVKLENGSSTNVSLTLRPPLNATL 

VllFHITFRSKOTTILELPDEVVVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIGWTVWAWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFN1GLLWVPYDCEQFLLKY

PNGVNPVNSNDVFFSLHAVVLTLIHVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDVVFFIQHFCLYRKRPGYD

QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

WGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK 

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRWQFCLRYTKEEEVKRIVSGIIHHTQAP 

KLLKJU-FLFSYATAAQNNTVTDPK>JHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFVVPTPLPEENVQRFQGHGIPIWCWSCHNGS 

ALLKMSAI^KEQDDGILQIQKSFLDGrVTCTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLIDNST 

EFWDTDKWFS1XESSSWLDIIRRCLKKAIEITEC 

MEAQNMNVLLLEENASDLCCLISSLVQLMMDPH 

CRTRIGFQSL1QKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

LGKBISKLD4SSDELQDNFREFYDSWHSKSTDYH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 

ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARJLSSLFPFALLQRHSSKPVLPTSGW

KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD
WFRSIPAITRYWFAATVAVPLVGKJLGLISPAYLF
LWPEAFLYRFQIWRP1TATFYFPVGPGTGFLYLV
NL YFL YQYSTRIJETGArDCjKPAI} YLr^ W J 
CWITGLAMDMQLLM1PLIMSVLYVWAQLNRDM 
IVSFWFGTRFKACYLPWVILGFNYIIGGSVINELIG 
mVGHLYrTLMFRYPMDLGGRNFLSTPQFLYRW 
LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 
WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQIMHLLSVGSILQL 
HAGWPDLLWAAHHACPRJD


3133 


A 


1 


2921 



MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGOilLKFVIPSEVKS 

ECHQDQPRAISIQSSEMIATNTRHCPNCRHSDLEA 

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPDFQRH

AHEQDTKMHEIYKGMTPQLNKNTLKTSAATDV 

WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLKTSQSESSDLAG 

RLKRKKIXKEYYSTESEPLTNGGQKPSSSDTFFR

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESL1LLSENLRKDVEAVTGSPASQTSICIGILLR 

SAEl^LLHPVDQAKTLKSPVSESVSPVVPDYLP 

TENGDFLSSKRKQISRDIbnURSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKPS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV 

RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG 

NKKNSTTbmiGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSWVFKITGVNGEIDIRGE 

DTEICLQVNQVTPDQLGNISLRHYLCNRPVGSDQ 

KAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 

QCHIENFSTEFLTSSLMNIQHFLEDETVATVMPM

KIQVSNTKJNLKDDSPRSSTVSLEPAPVTVHTOHL 

VVERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHIKKMTVE 


3134 


A 


9 


1579 



EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 
ERERVEDLFEYEOCKVGRGTYGHVYKARRKDG
KPEKEYALKQDEGTGISMSACRE1ALLRELKHPN 
VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 
RASKANKXPMQLPRSMVKSLLYQILDGIHYLHA 
NWVLHRDLKPANILVMGEGPERGRVKJADMGF
ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR 
HYTKADDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 
PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 
LQKDFRRTTYANSSLIKYMEKHKVKPDSKVFLL
LQKLLTMDPTKRITSEQAJLQDPYFQEDPLPTLDV
FAGCQIPYPKREFLNEDDPEEKGDKNQQQQQNQ 
HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 
AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP
SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS
QSTLGYSSSSQQSSQYHPSHQAHRY


3135 


A 


3 


1111 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKEKREKE 

RRRHGLGGAREAGGASREENGEVKPLPRDKXKD 

KIKERDKEKEREKKKHKVMNEIKKENGEVKILL 

K5GKEKPKTMEDLQIKKVKKKKKKKHK£NE^ 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

KDYVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ










NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


taamsiftptnqirltkvavvrmkragkrfeiac 

yknkvvgwrsgvekdldevlqthsvfvnvskg 

qvakkedlisafgtddqteickqiltkgevqvsd 

kerhtqleqmfrdlativadkcvnpetkrpytvi 

lieramkdihysvktnkstkqqalevikqlkek 

mkffirahmrlrfilpvnegkklkeklkplikvies 

edygqqleivclidpgcfreidelikketk:gkgsl 

evlnlkdveegdekfe 


3137 


A 


1 



3143 



mvegkrhvlhggrqermrakqkgkpldcssdl 

vrlmyhhnssplhkqssgpssspaaaaapekpg 

pkaaevgddflgdfwgervwvngvkpgwqy 

lgetqfapgqwagwlddpvgkndgavggvr 

yfecpalqgiftrpskltrqptaegsgsdahsves 

ltaqnlslhsgtatppltsrviplresvlnssvkt 

gnesgsnlsdsgsvkrgekduclgdrvlvggtx 

tgvvryvgetdfakgewcgveldeplgkndga 

vagtryfqcppkfglfapihkvirigfpstspaka 

kktkrmamgvsalthspssssissvssvassvgg 

rpsrsglltetssryarkisgttalqealkekqq 

hieqllaerdleraevakatshicevekeiallk 

aqheqyvaeaeeklqrarllvesvrkekvdlsn 

qleeerrkvedlqfrveeesitkgdletqtqleh 

arigeleqslllekaqaerllreladnrlttvae 

ksrvlqleeeltlrrgeieelqqcllhsgppppdh 

pdaaeilrlrerllsaskehqresgvlrdkyeka 

lkayqaevdklraanekyaqevaglkdkvqq 

atsenmglmdnwkskjldslasdhqksledlka

tlnsgpgaqqkeigelkavmegdcmehqlelgn 

lqakhdletamhvkekealreklqeaqeelag 

lqrhwraqlevqasqhrlelqeaqdqrrdael 

rvhelekldveyrgqaqaieflkeqislaekkml 

dyerlqraeaqgkqeveslrekllvaenrlqav 

ealcssqhthmffismdiseetirtketveglqdkl 

nkrdkevtaltsqtemlraqvsaleskcksgek

kvdallkekrrleaeletvsrkthdasgqlvus 

qellrkerslnelrvllleanrhspgperdlsre 

vhkaewrjkeqklkddirglrekltgldkeksl

sdqrryslidpssapellrlqhqlmstedalrda 

ldqaqqveklmeamrscpdkaqtignsgsangi 

hqqdkaqkqedkh 


3138 



A 


110 



2499 



QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSA11EULLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQVVRRRTQRRKSTTTALLFGEDDLEAL 

KAKKTKQTELVADLREAILRVARHFQCTDPKNCS










WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 


3139 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGBP 

HGMRPQLWMRLSJ3ALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF

ASWDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSD1PSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQDCCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 


3140 


A 



1 


4939 


SAALGASLAIPRPGLPGVHGRGPGTLSGRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDEIVGINDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAJEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTRRSDRFATTLRNEIQMHRAKLQK

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL








AGTYKDHLKEAQARVLRATSFKRRDLDPNPGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP

HPPRIGGRRRFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ

HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

WSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

1JJPPKQQH1JU.QTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRIERVMDNN'ri'VKMVPIKrVHSES 

QPEKESRQSLACPAEPPALPHGLEKDQDCTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEiPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKJEEPSVPAAVS 

LATNSTYYSTSAPKAELLIKMKDLQEQQEHEEDS 

GSDLDHDLSVKKQELIESISRKLQVLREARESLLE 

DVQANTVLGAEVEArVKGVCKPSEFDKFRMFIG 

DLDKWNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 

DILANYLSEESLADYEHFVKMKSALIIEQRELED 

KJHLGEECJLKCLLDSLQPERGK 


3141 



A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRWLAACSPYFHAMFTGEMSESR 

AKRVRIKJBVLXjWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLT1SSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRJuPLLPREYLVQRV 

EEEALVKNSSACKNYLIEAMKYHLLPTEQRJLMK 

SVRTRLRTPMNIJKLMVVVGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTVDSYDPVKIXJWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGWGGLLYAVGGYD

GASRQYLSTVECYNATTNEWTYIAEMSTRRSGA 

GVGVLlST^LYAVGGmGPLVRKSVEVYDPTTN 

AWRQVADMNMCRRNAGVCAVNGLLYVVGGD 

DGSChnLASVEYYNPTTDKWTVVSSCMSTGRSYA 

GVTVIDKPL 


3142 


A 


1211 


1311 


FSNLTTEKVAHAXEENLSMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSI 

KEEPKEAKHPDSQSMEESKLKNDDRKTPVNWK 

DSRG1RVAVSSPMSQHQSYIQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKVNTSPSVNTKTTTESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG










QRHLHTHHHTHVGMGYPLIPGQYDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


3144 


A 


78 


604 


SVSGIVLDLLPYIJHDFLSNMNLDGSAQDPEKREYS 

SVCVGREDDIKKSERMTAVVHDREWIFYHKGE 

YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 

HKYKITLATGEGLYQSINPKDPSAKPKWCSKGIK 

QRIHTVTVDNGNIYVTLSNEPFKCDSDFYATGDF 

KVIKSSS 


3145 



A 


2 


333 


RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFL1TOHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 


3146 


A 


3 


1151 


VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGrVIAEALQNQLAWLENVWLWITF 

LGDPKJ1JFLFYFPAAYYASRRVGIAVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFDLAH 

FPHQVLAGLITGAVLGWLMTPRVPMERELSFYG 

LTALAIMLGTSLIYWTLFIT^GLDLSWSISLAFKW 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC 

YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 

HPPQISLFYIFNFLKYTLWPCLVLALVPWAVHMF 

SAQEAPPIHSS 


3147 


A 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 

ALGGHPLLGVSATLNSVLNSNADCNLPPPLGGAA 

GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 

DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 

CMRHAMCCPGNYCKNGICVSSDQNHFRGEffiETI 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 

VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 

TKHRRKGSHGLEIFQRCYCGEGLSCRJQKDHHQ 

ASNSSRLHTCQRH


3148 


A 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLH

TPKLEHLDRVLYEWFLGKRSEGVPVSGPMLIEK 

AKJDFYEQMQLTEPCVFSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI 

QHLPVAYKAQGNAWVDKEIFSDWFHHIFVPSVR 

EHFRT1GLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTIFLPASVASLVQPMEQGIRIUDFMRNFINPPVP 

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRKLWPSVAFAEGSSSEEELEAECFPVKPHNKSF 

AHILELVKEGSSCPGQLRQRQAASWGVAGREAE 

GGRPPAATSPAEWWSSEKTPKADQDGRGDPGE

GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 

QLRALRAVFRSQQQVRRRRGALGAWKVEALQ

EGPGGCGATAQSPLPCSSTAGDN 


3149 


A 


132 


4125 


VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA 

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEK^TFILATLGTGWVEGTLPLVTTNFSP










LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVF1PVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTP1X1PAFAPTPWAPTPAPIFTPAPTPMPAATP 

AAEPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPF\OFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP I 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDVVFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKRADSHEEGSLEKKAKSSFRDFIP 

WLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNT .KF.PACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 

ARRLIVNKNAGETLLQRAARLGYKDWLYCLQK 

DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 

HGA 


3150 


A 


3 


2795 



SLRMHNLSILVRQIKFYYQETLQQLIMMSLPhfVLI 

IGKNPFSEQGTEEVKKLLLLLLGCAVQCQKKEEF 

IERIQGLDFDTKAAVAAfflQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLIQJMALHLKRLIDERDEH 

SETIEELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 

KARVBELKEDNQVLLETKTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LMEENMTLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQK5LGHEVNELTSSRLLKLEMENQSLTK 

TVEEIJR^TTVDSVEGNASKILKMEKENQRLSKKV 

EBLENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

EETLRENSERQIKILBQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKJENKILHESIKETSSKLSKIEFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KIT^KITCEK1EA1JSQENSEL£RENR10JCKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRKVESLKC 

ASMKMAQLQLENKELESEKEQLKKGLELLKASF 

KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL










ESELQDIJEMENQTLQKNLEELKISSKRLEQLEKE 
NKSLEQETSQLEKDKKQLEKENKRLRQQAEIKD 
TTLEENNVKIGNLEKENKTLSKEIGIYKESCVRLE 
ELEKENKELViaO\TIDIKTL\rn^REDLVSEKLKT I 
QQMNNDLEKLTHELEKIGLNKERLLHDEQSTDD
SRYKLLESKLESTLKKSLEIKEEKIAALEARLEES
TNYNQQLRQELKTVKKK


3151 


A


2 


2515 


GFWLHLTLLGASLPAALGWMDPGTSRGPDVGV

GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 

SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 

GRFVTEKHCKLHRAACLLGKR1TVIHSKDCFLKGD 

TCTN1AGYARLKNVLLALQTRLQPLQEGDSRQDP 

ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 

KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 

YMAFQWQLSLAPEDRVSVTTVTVGLSTVLTCA

VHGDLRPPIIWKRNGLTLNFLDLEDINDFGEDDS 

LY1TKVTTIHMGNYTCHASGHEQLFQTHVLQVN 

VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 

WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 

DTGAYTCIAKNEVGVDEDISSLFIEDSARKTLANI 

LWREEGLSVGNMFYVFSDDGIIVIHPVDCEIQRH 

LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 

RNRYIYVAQPALSRVLWDIQAHKVLQSIGVDPL 

PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 

ASTGQSQHLIRTPFAGVDDFnPPTNLIINHIRFGFI 

FNKSDPAVHKVDLETMMPLKTIGLHHHGCVPQA 

MAHTHLGGYFFIQCRQDSPASAARQLLVDSVTD 

SVLGPNGDVTGTPHTSPDGRFTVSAAADSPWLHV 

QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 

YNIYAALHTEPDLLFLELSTGKVGMJLKNLKEPPA

GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 

NGRQNTLRCEVSGDCGGTTWWVGEV


3152 


A 



1 



2645 


GAGWQVSLTGRWSPGREAGAGEVRQDPGSTAA

SPSSCDABLSARMARGERRRRAVPAEGVRTAER 

AARGGPGRRDGRGGGPRSTAGGVALAWVLSL I 

ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS

SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 

HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 

WSWRVTVEPQDSGTSALPLVSLFFYVVTDGKEV 

LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 

DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW

FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 

QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 

GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 

QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 

QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 

QLWQRWDPSLTREALGHWLGLLNADGWIGRE 

QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 1 

MLEVGDPDDLAFLRKALPRLHAWFSWLHQSQA 

GPLPLSYRWRGRDPALPTLLNPKTLPSGLDDYPR 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 

AEVAAELGPLAASLEAAESLDELHWAPELGVFA 

DFGNHTKAVQLKPRPPQGLVRVVGRPQPQLQYV 

DALGYVSLFPLLLRLLDPTSSRLGPLLDILADSRH










LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHYGHLEGPHQARAAKLHGE 
LRANVVGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 


A 


1 



4312 


MVDCTDELPAAAPADSAREHGSQAGGKGRPGAA 

AVLLADLERDARQGECALPGAAMAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELB 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVTVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 

EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS 

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 

KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 

WKPLQCLKVLTMSGWNPPPGNRKMHGDLMYLF 

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 

RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP 

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 

RLGYEEHIPGQTRDWNEELQTTRELPRKNLPERL 

LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN 

PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD 

VAAYVAFTNDLNGVRTYNAVDVEGLYTLGTVV 

VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK

TWSHPRYLELLERTSRPLKILRHQVLNDRDEEV 

ELCSSVECKGIIGNDGRHYILDLLRTFPPDLNFLP 

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 

FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN 

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 

ELAETIAADDGTDPRSREV1RNACKAVGSISSTAF 

DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 

AFLI^CQIPGLVKDCMEHAVLPVDGATLAEVMR 

QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG 

ELITRSAKHIFKTYLQGVELSGLSAAISHFLNCFLS 

SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN 

TAWAVMTPQELWKNICQEAKNYFDFDLECETV 

DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 

RHKPAFTEEDVLNIFPVVKHVNPKASDAFHFFQS 

GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 

VETCACXRLLARLHYIMGDYAEALSNQQKAVL 

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 

LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 

HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 

SHHLVARVYESKAEFRSALQHEKEGYTIYKTQL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 

NGSSANffPLKFTAPSMASVLEQLNVINGILFIPLS 

QKDLENLKAEVARRHQLQEASRNRDRAEEPMA 

TEEAPAGAPGDLGSQPPAAKDPSPSVQG 


3154 


A 


416 


4082 


KFKIJKIMLLTLIILLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT 

NYEQLWDAGVSV7MDFHYNEKRTYWVDLERQ

IXQRVFLNGSRQERVO^EKNVSGMAINWINEEV 

IWSNQQEGinVTOMKGNNSHILLSALKYPANVA 

VDPVERFIFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSUJVlJDKRUfWIQYNREGSNSLICSCD 

YIXjGSVmSKHPTQHNLFAMSLFGDRIFYSTWK






MKTIW1ANKHTGKDMVRTNLHSSFVPLGELKVV 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV

ENKIYFAHTALKWIERANMDGSQRERLIEEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKnTIENISQPRGIAVHPMAKM-FWTDTGINPRIE 

SSSLQGLGRLVIASSDLIWPSGITIDFLTDKLYWC 

DAKQSV1EMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVIRVNKRTGKDRVRLQG 

SMLKPSSLVWHPLAKPGADPCLYQNGGCEHIC 

KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSBDNITESQHM 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 

CMYIEALDKYACNCVVGYIGERCQYRDLKWWE 

LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 

AHYYRTQKJLLSKNPKNPYEESSRDVRSRRPADT

EDGMSSCPQPWFVVDCEHQDLKNGGQPVAGED 

GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV

SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 

SLLSANPLWQQRALDPPHQMELTQ 


3155 


A 


533 


212 


GTSGWYWERLAERRGRLWSREEAMATMENKVI 
CALVLVSMLALGTLAEAQTETCTVAPRERQNCG 
FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTID 
VPPEEECEF 


3156 


A 


2 


1585 


PRVRAADVAAGAQAVVSAGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA

NGFAERRIDKFGFTVGSQGAEGALEEVPLEVLRQ 

RESKWLDMLNNWDKWMAKXHKKIRLRCQKGI 

PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRTLPWSSVUIVWDMFFCEGVKIIFRVGLVLLK 

HALGSPEKVKACQGQYETTERLRSLSPKIMQEAF 

LVQEVVELPVTERQIEREHLLQLRRWQETRGELQ 

CRSPPRLHGAKADLDAEPGPRPALQPSPSIRLPLD 

APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEKP 

PAPNQAMWAAAGDACPPQHVPPKDSAPKDSAP 

QDLAPQVSAHHRSQESLTSQESEDTYL 


3157 


A 


3 


601 


SSAMGSRSSHAAVIPDGDSIRRETGFSQASLLRLH

HRFRALDRNKKGYLSRMDLQQIGALAVNPLGDR 

TTF-SFTTPDGSQRVDFPGFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRRNKLHYAFQLYDLDRDGKISR 

HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 

EIX3DGAVSFVJb^TKSI£KMDVEHKMSIRILK


3158 


A 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRIFITYM 
DNWRQNTTAJSQEALQAKVDAJBNFYYVILYLMV 
MIGMFSFUVAILVSTVKSKKREHSNDPYHQYTVE
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 


3159 


A 


3 


416 


PWGAAELDMGRRDAQLLAALLVLGLCALAGSE 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 

HY 


3160 


A 


179 


409 


KPKTKILKMVYYPELFVWVSQEPFPNKDMEGRL 
PKGRLPVPKEVNRKKNDETNAASLTPLGSSELRS 
PRJSYL.HFF 


3161 


A 


683 


1186 


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

mGRRKIAFAITADCGVGRRYAHVVLRKADIDLT 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 

KDVKDGKYSQVLANGLDNKLREDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162 


A 



1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYG1PVTGVLDQT 

TIEWMKXPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYS1HNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPFYQYMETHNFKLPQDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGEMFVFKDR 

WFWRLRNNRVQEGYPMQffiQFWKGLPARIDAA 

YERADGRFVFFKGDKYWWKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERRATDPGYPKPITVWKGIPQAPQGAFISKE 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRN1LRD 

WMGCNQKEVERRKERJU.fn3DDVDIMVTINDVP 

GSVNAVAVVEPCDLSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV


3163 



A 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMNDSL

RTNVFVRFQPETIACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HrmHGSPHLKAKHTRDDLKSSNRHGHKRKKSRS 

RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 

ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3 


3274 


DCRLQAAMPTNFTVVPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGKNMALFEEEMDSNPMVSSLLNKLANY 

TNLSQGWEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVILFLRLTWTVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGWPAGGSYYMISRSLG

PEFGGAVGLCFYLGTTFAGAMYTLGTEEIFLTYISP 





GAAIFQAEAAGOEAAAMLHNMRVYGTCTLVLM 

ALVVFVGVKYVNKLALVFLACVVLSILAIYAGVI 

KSAFDPFDIPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGEPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDLKDAQKSIPTGTILAIVTTSFIYLS 

CIVLFGACIEGWLRDKFGEALQGNLVIGMLAW 

PSPWVTVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGELI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTIVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASWKQED 

NPFSWKNFVDTVIU^TTAAHQALLVAKNVDSFPQ 

NQERFGGGHIDVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HLRISAEVEVVEMVENDISAFTYERTLMMEQRS 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKPLFSMKPDQSNVRRMHTAVKLNGVVLNK 

SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE

GLNRVLLVRGGGREV1TIYS 


3165 


A 



3 


2681 


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

ARNVLAVETVPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT

NEEWELLDPTPKDLEESIVQEEKJKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

inaSEGEGNGPPPTVAPSSPSVVPVARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWENYFASTVNREMMCSPEL 

KNLIRAGIPHEHRSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKALEKQNPASKQEBLDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWRNPDIGYCQGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 

DYTLITFNWFl^VVFVDSVVSDILFKIWDSFLYEGP 

KVIFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 

RTILX>ARSGTDAPTTWRKSGWS 


3166 


A 


10 


4070 


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR


TRAFADLLVERQTGQQDSDPYSPVTTOQILEMVN 

GQRGLVLYYSLAAGYLYSWLLAPGAGrVKFHEH 

YLGEOTVENSSDFQASSSVTLPTATGSALEQHIAS 

VREALGVESHYSRACASSETESEAGDIMDQQFEE

MN>n<iNSVTDPTGFLRMVRRN^F>^SCQSMT^ 

LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL

IAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 

ALLKGSSSNEYLYERFGLLAVPS1RSLSVQSKSHL 

RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG 

PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 

ALTQAECVHFATHISWKLSALVLTPSMDGNPASS 

KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 

LLTAADVLDLQLPVKLWLGSSQESNSKVAADG 

VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH 

AFYSSLLNGLKASAALGEAMKVVQSSKAFSHPS 

NWAGFMLIGSDVKLNSPSSLIGQALTEELQHPER 

ARDALRVLLHLVEKSLQRIQNGQRNAMYTSQQS 

VENKVGGDPGWQALLTAVGFRLDPPTSGLPAAV 

FFPTSDPGDRLQQCSSTLQSLLGLPNPALQALCK 

LITASETGEQLISRAVK>JMVGMLHQVLVQLQAG 

EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 

V1XEVGQEEVILKTGKQANRRTVHFALQSLLSLF 

DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 

QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 

GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 

PQTRPAGNKDEEEYEGFSIISNEPLATYQENKNTC 

FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 

MTLIPSPNSPFQKVGKLASSDTGESDQSSTETDST 

VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 

RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 

RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 

SPTTSEMSIKDSPSQHSGRPSPGCDSQTSQLDQPL 

FKLKYPSSPYSAHISKSPRNMSPSSGHQSPAGSAP 

SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV 

QAVHNLKMFWQSTPQHSTGPMKEFRGAPGTMTS 

KRDVI^LLhnLSPRPNKKEEGVDKLELKELSLQQH 

DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 

ARPLRLPSGNGYKFLSPGRFFPSSKC 


3167 


A 


1 



762 



AARRRQKGKEENMMMDLFETGSYFFYLDGENV 
TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 
DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 
SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 
NPNQRLPKVEILRSAISY1KRLQDLLHRLDQQEK
MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 
VSDHSRGLVTTAKEGGASIDSSASSSLRCLSSIVDS 
ISSEERKLPCVEEWEK


3168 


A 


701 


246 


TSRRVTMKFNPFVTSDRSKNRJK^HFNAPSHVRR 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKWQVYRKKYVIY1ERVQREKANGT 

TVHVGIHPSKVVITRUCLDKDRKKILERKAKSRQ 

VGKEKGKYKEELIEKMQE 


3169 


A 


156 


3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGVVWGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTHHQKVEKKXKEKTVEKKGKT 
KKKJEEKPNGKIPDHDPAPNVTVLLREPVRAPAV










AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSVVNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLEEI 

LSEKAGHQDTWHKATQKGDPVAILKRQLEEKEK 

U^TEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK 

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRKALEAKAAAFEKQVLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSffiALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 

LPELSVX.AQQhTVTEWLQDLKEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 

AGAPASSPEAPPAEQDPVQLKTQLEWTEADLEDE 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 

ESSETEEASQUCERLEKEKKLTSDLGRAATRLQE 

LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 

EGTSV 


3170 



A 


6730 


4027 


THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 

PY1HNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE

KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 

ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPIEEKAVTPSPEQVFAECSQKRILGLL 

AAMLPPLKSGPTVPLIDLEHVLPLMFQWISNAG 

HLNETYHLTLGLLGQLDRLLPAEVDAAVIKVLSA 

KHNLFAAGDSSIVPDGWKTTHLLFSLGAVCLDS 

RVGLDWACSMAEILRSLNSAPLWRDVIATFTDH 

CIKQLPFQLKHTNJJb-lLLVLVGFPQVLCVGTRCV 

YMDNANEPHNVIIIJCHFTEKNRAVWDVKT*RK^ 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEIRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDLE1LSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRJCFLMAHDAiNAPLHILRAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMHLPCLSRPARCDQATAESNPVT 

QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 

SKRAVRDYLFRVNEATAVLYARHVLASLLAEWP 

SHVPVSEDILELSGPAHMTYIIJDMFMQLEEKHE 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 

TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 

PERDFQLNQKALSPSSQFPSAEELRHIR 


3171 


A 


557 


89 


GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN










QALARFY CYTERTIAKRLVT.RRDPSVKRTLCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRFLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ 


3172 


A 


2 


496 


FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYPINIVALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVAIASinGLLnGISCAVHFl'KNA 


3173 


A 


2 



4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCRJEDWPSGRGKKEEKMKKHRRA

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNLPISPKEHKLKDDSrWVQNTESKKLSPPVVE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIP1AKPSETEQSETDCDVGEALDAS 

APIEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAKJLAANPEAKSTSAILIENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKKLLGSATNAILNMVNIAANILGAKTEDLT

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQIFCSELTnCCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNIEREAE 

TWLGDLSSSMHQDDL VNHTVDA VRT ,KPSHSQT 

LSQSLLLDITPEINPLPKDEVSESVEYEAGHIPSPVI 

PQESSVEIDNETEQKSESFSSIEKPSITYETNKVNE 

LMDN1IKEDVNSMQ1FTKLSETIWPINTATVPDN 

EDGEAKMNIADTAKQTLISVVDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTOLGYANGN 

LVHGSNQKESVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKTTVKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCVVLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYTVEPLKFSP 

EKKKKRCKYKIEKIETIKPEEPLHPIANGDIKGRK 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLnCTLIQTKSGSLPSLHDEKGNKEITV 

GTFGVTAVSGHI 


3174 


A 



485 


4668 



RKCSKEKASKTPSQKIP1TPCCVLQAGPEPRSLAB 

RMGADGETVVUCNMLIGVNLILLGSMIKPSECQL 

EVTTERVQRQSVEEEGGIANYNTSSKEQPVVFNH 

VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 

MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 

ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL






DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSIELEWDGPMAVTEYVISYQPTA1X3GLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYIVNVVALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFELLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKWYTTLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETS1SLIWTKASGPID 

HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 

YnSVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDmSNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSWPNTVTEFTITR 

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATNITPTEALLQWKAPVGEVENYVIVLTHF 

AVAGETILVDGVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEDENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLL£bTroYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMT1 JDGGG WI VFQRRQNGQTDFFRK. 

WADYRVGFGNVEDEFWLGLDNIHRITSQGRYEL 

RVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGS 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 

MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 

WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 

SLQF 


3175 


A 


2 


623 


RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 

AATAJEGTMASGVTVNDEVIKVFNDMKVRKSST 

QEEUCKRKKAVLFCLSDDKRQIIVEEAKQILVGDI 

GDTVEDPYTSFVKLLPLNDCRYALYDATYETKE 

SKKEDLVFIFWAPESAPLKSKMIYASSKDAIKKK 

FTGIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 

LEGKPL 


3176 


A 


99 



1567 



PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

ADEGSIFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

IAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRHRDRPTTGNTLKSGLCSALTTYFFGADLKGK 

LTIKNFLEFQRigLQHDVLKLEFERHDPVDGRITE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVENFFTFLKNINDVDTAJLSFYHMAGAS










LDKVTMQQVARTVAKVELSDHVCDVVFALFDC 
DGNGELSNKEFVSIMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 


182 


648 


LGWGSGAAVGGRQAARGAALGRRPMAAVLG 
ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 
RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA 
FGFMSRVALQAEKMNHHPEWFNVYNKVQITLTS
HDCGELTKKDVKLAKFIEKAAASV 


3178 


A 


8 


612 



ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

LRRPHLLHTPRAPTFRIRLGAHRGGSGELLENTM

EAMENSMAQRSDLLELDCQLTRDRWWSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELIREIAGLVRRYDRNEmWASEKSSVMKKCK 


3179 



A 


88 


1496 



QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN 

LTKNDLYPNPKJPEVLHMIYMRALQIVYGIRLEHF 

YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 

CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 

RETYMEFLWQYKSSADKMQQLNAAHQEALMK 

LERLDSVPVEEQEEFKQLSDGrQELQQSLNQDFH 

QKTIVLQEGKSQKKSNISEKTKRLNELKLSWSL 

KJEIQESLKTKIVDSPEKI>KNYKEKMKDTVQKLK 

NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 

IQDLSDNREKLASILKESLNLEDQIESDESELKKL 

KTEENSFKRLMlVKKEKLATAQrTCINKKHEDVK 

QYKRTVDEDCNKVQEKRGAVYERVTTINHEIQKI 

RLGIQQLKDAADREKLKSQE1FLNLKTALEKYHD 

GIEKAAEDSYAK1DEKTAEUCRKMFKMST 


3180 


A 


298 



7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG 

TLPWVQGnCNANNPCFRYFrPGEAJPGVVGNFNK 

SrVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQDCKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLKKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNS1TPYCNDLMKNLESSPL

SRIIWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAWHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKJRMGIDNVERTNK 

DCDGYWDPGPRADPFEDMRYVWGGFAYLQDW 

EQAmVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWlYSVAVnKGrVYEKEARLK 

ETMRJMGLDNSBLWFSWFISSLIPLLVSAGLLVVI 

LKLGNLLPYSDPSWFVFLSVFAWTILQCFLIST 

LFSRANLAAACGGIIYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEJXjF^TTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRISEICMEEEPTOLKLGVSIQNLVKVY






RDGMKVAVDGLALNFYEGQITSFLGHNGAGKTT

TMSDLTGLFPPTSGTAYILGKDIRSEMSTIRQNLG

VCPQHKVLFDMLTVEEHIWFYAIUJCGLSEKHVK 

AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS

VALAFVGGSKVVILDEPTAGVDPYSRRGIWET J J , 

KYRQGRTnLSTHHMDEADVLGDRIAnSHGKLCC 

VGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNS 

SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTID 

VSAISNLIRKHVSEARLVEDIGHELTYVLPYEAA 

KEGAFVELFHEIDDRI^DLGISSYGISETTLEEIFL 

KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 

RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS

YQVKGWKLTQQQFVALLWKRLLIARRSRKGFF 

AQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWM 

YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT

RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 

NGNWTMQNPSPACQCSSDKIKKMLPVCPPGAGG 

LPPPQRKQNTAD1LQDLTGRN1SDYLVKTYVQIIA 

KSLKNKIWVNEFRYGGFSLGVSNTQALPPSQEV 

NDATKQMKKHLKLAKDSSADRFLNSLGRFMTG 

LDTRNNVKVWFhn^GWHAISSFLNVINNAILRA 

NLQKGENPSHYGITAFNHPLNLTKQQLSEVAPM 

TTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKA 

KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 

HIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLM

YPASFVFKIPSTAYVVLTSVNLFIGINGSVATFVL 

EUOT>NKLNNINDrLKSVFLIFPHFCLGRGLIDMV 

KNQAMADALERFGENRFVSPLSWDLVGKNLFA 

MAVEGVWFLITVLIQYRFFIRPRPVNAKLSPLND 

EDEDVRRERQRILDGGGQNDILEIKELTKIYRRK 

RKPAVDRJCVGIPPGECFGLLGVNGAGKSSTFKM 

LTGDTTVTRGDAFLNRNSILSNIHEVHQNMGYCP 

QFDAITELLTGREHVEFFALLRGVPEKEVGKVGE 

WAIRKLGLVKYGEKYAGNYSGGNKRKLSTAMA 

LIGGPPWFLDEPTTGMDPKARRPLWNCALSVV 

KEGRSVVLTSHSMEECEALCTKMAIMVNGRFRC 

LGSVQHLKNRFGDGYTIVVRIAGSNPDIJCPVQDF 

FGLAFPGSVPKEKHRNMLQYQLPSSLSSLARIFSI 

LSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 

DHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESY 

V 


3181 


A 


215 


1367 


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR 

HWLFTEFPEI^PSQNQNHLKDWFIJBNKSEVPEC 

RNNEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN

VTQKISDLEICADEFPGSSATYRJLEVGCGVGNTV 

FPILQTNNDPGLFVYCCDFSSTAIELVQTNSEYDP 

SRCFAFVHDLCDEEKSYPVPKGSLDIHUFVLSAI 

VPDKMQKAINRLSRLLKPGGMVLLRDYGRYDM

AQLRFKKGQCLSGNFYVRGDGTRVYFFTQEELD

T1JTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 

WIQCKYCKPLLSSTS 


3182 


A 


3 


1289 


GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF 
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ




AEIENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEWMNSQQTPVGTPKDKRVSNTPLRTV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKWHAVDGTAENGIHP 

LSSSEVDELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMIFMGYQNVEDEAETKKVLGLQDTITAEL 

WTEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKKHRCKCCSIM 


3183 


A 


333 


1931 



IAPTGGSHSE1QKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRJLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSVVSLTRWLPNLT 

DWVPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRJHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 


3184 


A 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFILFLTRSRGRAASAGQEPLHNEELAGAG 

RVAQPGPLEPEEPRAGGRPRRRRDLGSRLQAQR 

RAQRVAWAEADENEEBAVILAQEEEGVEKPAET 

HLSGKIGAKKLRKLEEKQARKAQREAEEAEREE 

RKRLESQREAEWKKEEERLRLEEEQKEEEERKA 

REEQAQREHEEYLKLKEAFWEEEGVGETMTEE 

QSQSFLTEFINY1KQSKVVLLEDLASQVGLRTQD 

TOnUQDLLAEGTITGVIDDRGKFTYTITEELAAVA 

NFERQRGRVSIAELAQASNSLIAWGRESPAQAPA 


3185 


A 


2981 


7173 



CLLAGKFSSTLYETGGCDMSLVNFEPAARRASNI 

CDTDSHVSSSTSVRFYPHDVLSLPQIRLNRLL'nD 

TDLLEQQDIDLSPDLAATYGPTEEAAQKVKHYY 

RFWILPQLWIGINFDRLTLLALFDRNREILENVLA

VILAILVAFLGSILLIQGFFRDIWVFQFCLVIASCQ 

YSLLKSVQPDSSSPRHGHNRIIAYSRPVYFCICCG 

LIWLLDYGSRNLTATKFKLYGITFTNPLVFISARD 

LV1VFTLCFPIVFFIGIXPQVNTFVMYLCEQLDIHI 

FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL

KDSWDGQHDPVLFSIFCGLLVAVSYHLSRQSSDP 

SVLFSLVQSK1FPKTEEKNPEDPLSEVKDPLPEKL 

RNSVSERLQSDLVVCIVIGVLYFAIHVSTVFTVLQ 

PAlXYVLYTLVGFVGFVTrTYVLPQVlOCQLPWH 

CFSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 

LLFVEK>HIYPLI\^NEI^SSAETIASPKJGLNTELG 

ALMITVAGLKLLRSSFSSPTYQYVTV1FTVLFFKF 

DYEAFSETMLLDLFFMSILFNKLWELLYKLQFVY






TYIAPWQITWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAITSTPLNPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLA S QLDRJSfPGTYC QQRE VE AITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VTVTKYILEGYSrrDNSAASMLQVFDLRKVLTTY 

YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCSSRRAKPVDVDKDSSLVTLCYGLCVL

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDEWIFADMELLRKVVVPGIRMSIKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL

SFRVKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPIS1X3NIRNFIVSTWHRLRKGC 

GAGCNSGGNBBDSDTGGGTSCTGNNATTANNPH 

SNVTQGS1GNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQISLRNLPSSIQSRLSMVNQ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGNTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSHTOKAVLLVQIDDKYVTVIETGVLELGAEV 


3186 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLWEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3187 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSHCLWEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3188 


A 


2 



3483 


PRVRTKLDLLVNDKXRYERVGGGPKRLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

QGYRQTPYNNVQSRINTGRRKANENAGLQECPR 

KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQDBELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNIRDNVEMKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG 

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQIAQLETALKSDLTDKTEILDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RIKLYNQENDINADELSEALLLIKAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLIMQHKJNKJDYQMEVEAVTRKMENLQQD 

YELKVEQYVHLLDIRAARIHKLEAQLKDIAYGTK 

QYKFKPEIMPDDSVDEFDETIHLERGENLFEIHIN 

KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP

VVRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITL 

EVHQAYSTEYETTAACQLKFHEILEKSGRIFCTAS 

LIGTKGDffNFGTVEYWFRLRVPMDQAIRLYRER 

AKALGYTTSNFKGPEHMQSLSQQAPKTAQLSSTD










STDGNLNELHTITRCCNHLQSRASHLQPHPYVVY 

KFFDFADHDTAIIPSSNDPQFDDHMYFPWMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRCISGIFELTDHQKHPAGTIHVILKWKFA

YLPPSGSITTEDLGNFIRSEEPEVVQRX.PPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI

SKNTKQPSEKIRIEIIALSLNDSQVTMDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAECRDILKAJLQKQEMPNRSLRFTVVS 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 

QNIDVFDAJUUXJEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 


3189 


A 


476 


1175 


MKGSGWHLRSGMVGTLITTIIJPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSG1ACACAVIGMKC1R 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 

NDWQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRAll'llANTAP 

AYQPPAAYKJDNRAPSVTSATHSGYRLNDYV


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 

GSFLKELEKSKPLPSISTKENTLSKSLEEKLRGLS 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 

DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 

GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 

RQFKWTRSQEDPNAYSGELFDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

FTTGDAGASSTYPMQCSAJLRKNGFVVLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 

PSTHNMDVPNIKRNDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA

MSEEYAVAJKPCK 


3192 


A 


105 


1661 


KVSAJDGMQSCESSGDSADDPLSRGLRRRGQPRV 

WIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 

ANGLLEETTDGERSVGRISLYSKNGVACYLTNH

GRRIPKDVVEEFSDLYNEVYNLTQEFFRHDKPVN I 

AESQNSVGVFrREE\nEWRIRNDPDDPEATKRLJCL 

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP

GAHHBPSGFMRWKT J, AEGEPAHVIQLGKPVRCI 

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 

EEPRGGRWDEDEQWSWVECEDCELIPADHVIV 

TVSLGVLKRQYTSFFRPGIJTtKVAAlHRLGIGTT 

DKIFLEFEEPFWGPECNSLQFVWEDEAESHTLTY 

PPELWYRXICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPN1PKPRRI 

LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK 


3193 


A 


1 


1928 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
ANLSVAHFKDSNS1TPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSEERGK 
WVFFQNCHLAPSWMPALERLBBHINPDKVHRDF 






WO 01/57190 



PCT/US01/04098 













RLWLTSLPS^vOCFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFTDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTODWDRRCI 

MNEJEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANITFAQNETFA 

LLGTIIQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWVMAKYPVLYEESMKTVLVQEVIRYNR 

LLQV1TQTLQDLLKALKGLVVMSSQLELMAASL 

YNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDF 

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIHG 

LFLEGARWDPEAFQLAESQPK^YTEMAVIWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVIAVEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1 


1023 


IX5WTPVHAAVDTGNVDSLKLLMYHRIPAHGNS 

FNEEESESSVFDLDGGEESPEGISKPWPADLINH 

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLENLN/O.KIPLJUS 

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNHFQAISSDGWWSLEDVTCNN1TDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 

CSLVA 


3195 


A 


1 


1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDI^QSSYPGDKnCPKTTCPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI 

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRR 

SHLTRHQRfflSGEKPYKCSECGKAFIHRSTFVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYHHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

EO^ECGKAFCESADLIQHYIIHTGEKPYKCMECG 

KAFNRRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT

GEKPYECIQCGKAFCRSANLIRHSIIHTGEKPYEC 

SECGKAFNRGSSLTHHQRIHTGRNP11VTDVGRP 

FMTAQTSVNIQELLLGKEFLNTTTEENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARGTG 
ALLLRGSLLASGRAPRRASSGLPRNTVVLFVPQQ 
EAWVVERMGRFHRIl^GLNILIPYLDRIRYVQSL 
KEIVn^TVTPEQS A VTLDNVTLQDG) G VLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FRERESLNASIVDAINQAADCWGIRCLRYEIKDIH 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQELASEAEKAEQINQAAGEASAVL 

AKAKAKAEAERJLAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTILLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 



ELDRVKMS 



3197 



66 



3632 



LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD 

DTEDFKEPLKVVFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKJTRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL 

RNFNMMAPLHIAVQGMNNEVMKVLLEHRTIDV

NLEGENGNTAVIIACTTNNSEALQILLNKGAKPC 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY 

SRQLHWMNNGKATPLH1j\VQNGDLEMIKMCL 

DNGAQBDPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIVNTTDGCHETMLHRASLFDHHELAD 

YLISVGADINKIDSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKWQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKVILDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEWLT1IRSKRWDECLKIFSHNSPGNKCPITEM 

IEYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS 

YCLGLIPMmvVNIKPGMAFNSTGlINETSDHSEI 

LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWirYTTGIlFVLPLFVEIPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGIFIVMLE 

VELKTLLRSTWFIFLLLAFGLSFYBLLNLQDPFSS 

PIXSUQTFSMMLGDINYRESFLEPYLRNELAHPV 

LSFAQLVSFTTFVPIVLMNLLIGLAVGDIAEVQKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGErRQEEPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELDCLIIQKMEn 

SETEDDDSHCSFQDRFKXEQMEQRKSRWNTVLR 

AVKAKTHHLEP 



3198 



51 



2177 



KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE

GESFAGSVQBPGGTTVLVELTPDimCGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

VPATQTQTTTRTITSETQTITVSAPEFVFEHGYQT

Y1JPTESNENQTATV1SLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPFKCEFCKVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRIHERIHCTVRPFKCNYCS 

FDSKQPSNLSKHMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKIIVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSUAPPQSSRCPSEAGAMTQPAVLLTTHEQTD 










GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 

FEGLNALIQEGTAEVTWSDGGQNIAVATTAPPV 

FSSSSQQELPKQTYSUQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTOWSILQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRRLLRRLVGALVAEAGFCYVQVAEGQRWGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL

AWLARWPLPPPAGRCPRDAPEARVPEKARAEG 

^FRFlsINYGC'fiVVGOFMTTLVLD'NCi A YNA KTOY 

SHENVSVBPNCQFRSKTARLKTFTANQIDEIKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 

OVDFLDTNIIITEPYFMT^IOESMNEILFEEYOFO ' 

AVLRVNAGALSAHRYFRDNPSELCCI1VDSGYSF 

THIVPYCRSKKKKEAIIRIKVGGKLLTNHLKEIISY 

RQLHVMDETHVINQVKEDVCYVSQDFYRDMDI

AKLKGEENTVMIDYVLPDFST1KKGFCKPREEMV ! 

LSGKYKSGEOILRLANERFAVPEILFNPSDIGIOE 

MGIPEAIVYSIQNLPEEMQPHFFKNrVLTGGNSLF 

PGFRDRVYSEVRCLTPTDYDVSVVLPENPITYAW 

EGGKLISENDDFEDMWTREDYEENGHSVCEEK 

FDI 


3200 


A 


3 


307 


AVQRIRHEMN1FRLTGDLSHLAAIVILLLKIWKTR 

SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 

MKVWYAIHRKVnFHLQCTGL^ 


3201 


A 


1 


469 


IRHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 
YLRVFFRTND AK VG TL VG ED K YGNKYYEDNKQ 
FFGRHRWWY 1TEMNG KNTF WD VDG SMVPPE 
WHRWLHSMTDDPPTTKPLTARKFnVTNHKFhA^ 
GTPEQYWYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM

PQWRVSAFIENNIWFENFWEGLWMNCVRQANI 

RMQCKJYDSLLALSPDLQAARGLMCAASVMSFL 

AFMMAILGMKCTRCTGDNEKVKAHILLTAGnFII 

TGMVVLIPVSWVANAIIRDFYNSIVNVAQKRELG 

EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR 

YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 

VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 

WEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 

LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 

ATAALLRHKVKARLTKKDS 


3204 


A 



1808 


668 


PESAPLPAFISSRILPAAWRNWCSYWTRTISCHV 

QNGTYLQRVLQNCPWPMSCPGSSYRTVVRPTYK 

VMYKTVTAREWRCCPGHSRVSCEEVAGSSASLE 

PMWSGSTMRRMALRPTAFSGCLNCSKVSELTER 

UCVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 













DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPENDRMRM

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSIYHTVWKFFCRDHFGWREYPESVTRLIEE 

ANSRGLKEVRFMMWNNHYlIJIKSITRREnCRRP 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YRIIYNIJHKTVPEFKYRILQILRVQNQFLWEKY 

KRKKEYMNRKMFGRDRII^RHLFHGTSQDVVD 

GICKHNFDPRVCGKHATMFGQGSYFAKKASYSH 

NFSKKSSKGVHFMFLAKVLTGRYTMGSHGMRR 

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSI 


3206 



A 



297 



4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLK.YKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

lFTVPSLARMLITEENLMSIIIKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQfflEMEPEWEAAFTLQMKLTHVISNIMQDWC 

ASDEKVLJDBAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLffiHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKRFSSEITHKDVVQQNNTLIEEMLYL 

IIMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM 

AHSELVKSI^EDENKETGMESVIEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVEEEHVVTFTFTQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKJFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFhWRLNFSDQPNLTQWIRTISQQIKALQFLRKE 

ESTPNNASTKNSENVDELQLPEGFRPDFRPKIPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSWQGHFCKPFASLVPND 

SHEELPCILDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHIFHLVTMAHIIQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEIPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLEBS 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 










RRGNPLHLCKERFKKIQKLWHQHSVTEEIGHAQ 
EANQTLVG1DWQHL 


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGPVYIGELPQDFLRITPTQQQRQVQLD 

AQAAQQLQYGGAVGTVGRLNITWQAKLAKNY

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

NKVIHCTVPPGVDSFyXEIFDERAFSMDDRIAWT 

HTTIPESLRQGKVEDKWYSLSGRQGDDKEGMINL 

VMSYALLPAAMVMPPQPWLMPTVYQQGVGY 

VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 

DLKAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN 

SLLQMGEEP 


3208 


A 


54 



1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFVXLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

V 11FTGLFSV AFLGRRLVLSQWLGILATIAGLVW 

GLADLLSKJHODSQHKLSEVITGDLLIIMAQIIVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVE.SLLL 

VPMYYIPAGSFSGNPRGTLEDALDAFCQVGQQP 

LIAVALLGNISS1AFFNFAGISVTKELSATTRMVL 

DSLRTVVIWALSLALGWEAFHALQILGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 


3209 


A 



104 



1999 


AKVVSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKR1CVNVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNIPEDAQGVEEREALARMAANV

ENPASADSEAYIEKYLRSVXAVENLLTLDRLRQE

VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQNNHS PDPGL SNL/VA S YLNP VKSFVPQMPKLL 

KSLFPVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

GAEGNAPAPGAGGQALASDSEEADEVPEWLREG 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 

LPS GKNIXjSIGGKQYFRCOTG YGLL VRPSRVRR 

ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

LTAALAKADRSHKNPENRKSWAS 


3210 


A 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALWS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPK1NV WurLAAiov Ir VvjVMuMKoi X YvjJsJ* 

MPVGUAGAS1XMAAKVGVRMLMTSD 


3211 


A 


1078 


594 


VGNffiLPAVNLKVILLGHWLLTTWGCIVFSGSYA 
WANFTILALG\TVVAVAQRDSIDAISMFIXKjLLATI 
FLDIVHISIFYPRVSLTDTGRFGVGMAILSLLLKPL 
SCCFVYHMYRERGGELLVHTGFLGSSQDRSAYQ 
TIDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRKADPVASAIMDPAEAVLQEK 
AIJCFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT










AFQNSSEREDCNNGEPPRJKHPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSIXjMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY

DERVirrGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRS1AVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTDRLWDDBCGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 



1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKHPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDH1AENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKUQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRJSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYTVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRC1RFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND

PAAQSEPPRSPSRTYTYISR 


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCTGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKQPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDH1AENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLEERMVRTDSLWRGLAERRGWG 

QYO s KNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKTV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY

DERVnTGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITT^ 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRL WDIECGACLRVLEGKEFI ,VRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 










LVEHSGRVFRLQFDEFQWSSSHDDmiWDFLND 
PAAQSEPPRSPSRTYTYISR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRWTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKTVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVTTNARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEnCRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEHIDHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITG1EVTDE 

GWWRGYGPDGHFGMFPANYVELIE 


3216 


A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTRVRVVDNSALGNSPYHRAPRCIHVYKKN 

GVGKVGDQUXAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNWLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 


3217 


A 


1 



1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAK1TPEKVCKFIRL 

CGNIOIRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 



A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRJLLPPQELCRKGGFCEELGAPARLTQ 

WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP

ASITKECIILVDTYSPSLVQLVAKJTPEKVCKFIRL 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLTVS SHNLESKSTKRJDIL V AFKG G CSILPLP | 

YMIQCKM 7 VTQYEPVLIESLKDMMDPVAVCKKV 1 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3219 


A 


1623 


572 


TSAEGWKGCTCTFKDRSKLREHLRSHTQEKWA 










CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKJU^ATERLLRDHMRNHVNHYKCPLCDMTCPL 

PSSLRNHMRFRHSEDKPFKCDCCDYSCKNLDDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSDCSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 



A 


2760 



745 


SLGIPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 
GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 
YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 
GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL
CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 
PRTPGPPRSTPUBENVVDREQIDFLAARQQFLSLE 
QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 
HIJVNGHVVPIKPQVKGVVREE^VRAVPTWAS 
VQWDDPGSLASVESPGTPKETPIEREIRLAQERE 
ADLREQRG1JIQATDHQELVEIPTRPLLTKLSLITA 
PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 
GRASTPDWVSEGPQPGLRRALSSDSILSPAPDAR 
AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 
FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS
GKPLSTFCQEASKPPRGCPQANRGWRWEYFRLR
PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 
KSQSSDLLERERESVLRREQEVAEERRNALFPEV 
FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSPI 
HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 
PSDGINSEVLEAmVTRHKNAMAERWESRJYASE 
EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYnSVKGlKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGXOTuYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 



A 


207 


1321 


PLIPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGS V'l'KJ ' VYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEnQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN

FSTIIGVGVGAGAYILARYALNHPDTVEGLVLIN1 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ

EEl^GNSELIQKYRMITHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLWGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKXTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGKRJSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 
GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 
EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGOG 










TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

Hl^ARYIXQETWlJEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKXKEDFRLHFRNISRIMDCVGCFKC 

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQEIVSLFNAFGRISYKCERIRKTSKNLLQ 

NIH 


3224 


A 


2 


803 


PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLP 

SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 

LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 

TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 

IPSYIRDSTVAVVVYDITNLNSFQQTSKWIDDVRT 

ERGSDVIIMLVGNKTDLADKRQIHEEGEQRAKE 

I^VMFIETSAKTGYNVKQLFRRVASALPGMENV 

QEKSKEGMIDIKLDKPQEPPASEGGCSC 


3225 



A 



3 


5054 


PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKRVA 

VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 

GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 

GALLQSESGTAPDSTLGGAAASNYANSTWGSGA

SSNNGTSPNPIHIWDKVIVDGSDMEEWPCIASKD

TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 

GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST

TENhWGLGNWRKVSGQDRIGPGSGFSNFNPNSN 

PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 

QTSREQQSKMENAGVNFVVSGREQAQIHNTDGP 

KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 

TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 

QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 

SWDhnWRSTGGSWNFGPQDSNDNKWGEGNKM 

TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 

GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 

EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 

QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 

WDDBEVPRPEGKSDKGTEGWESAATQTKNSGG 

WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 

WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 

QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 

WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 

SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 

QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 

WSSGPQPATPKDEEPSGWEEPSPQSISRKMDIDD 

GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 

NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 

PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL

SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 

DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 

GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 

PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 

QIAMLSQLPQIPQFQLACQLLLQQQQQQQLLQN 

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 

QPGMKHSPSHPVGPKPHLDNMVPNALNVGLPDL 

QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 

FKQWTSMMEGLPSVATQEANMHKNGATVAPGK 










TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGH^Q^HLSNK>IWK^^HISSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGN'J "1 1L AEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 



WWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP

PSSLPPI^QAO^SSSSACTlJDSFFPia-APWDSPQDC ! 

GFKDHQPLTLQALTVELARWTIJvILLLSTAMYG j 

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC 

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHTITLGLLLLILYCLI 

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALWLSQALNGL 

LMSAVMKHGSSITRLFWSCSLVVNAVLSAVLL

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT

WYAMAKKDPEGLFLQDNIVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKJV 

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 


3228 


A 


430 



1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM

LKCVWGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFLICFSVVNPASYHNVQEEWVPEL 

KDCMPHVPYVLIGTQ1DLRDDPKTLARLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

VFDEAILTIFHPKKKKKRCSEGHSCCSII 


3229 


A 


25 


722 


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT 

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFHFILFNNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

GKEQRWEMVMDKKHFKLWRRPITGTHLYQYRV

FGTYTDVTPRQFFNVQLDTEYRKKWDALVDCLE 

VIERDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR 










YCVSWMVSSGMPDFLEKLHMATLKAKNMEIKV 
KDYISAKPLEMSSEAKATSQSSERKNEGSCGPAR 
BEYA 


3231 


A 


2117 


590 


FWEPPEAGASSPCAPGDPDMSFRKVVRQSKFRH 
WGQPVKNrX}CYEDIRVSR\0*WDSTFCAVNPKF 
L A VTVEA S GGGAFL VLPLSKTGRIDKA YPTVCGH 

TGPVLDroWCPH^EVlASGSEDCriVMVWQIPE 

NGLTSPLTEPVVVLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE 

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYraiTEEPPYIHFLNTFTSKEPQR 

GMGSMPKRGLEVSKCEIARFYKLHERJCCEPIVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPILISLREAYVPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

KLEEVMQELRALRALVKEQGDRJCRLEEQLGRM 

ENGDA 


3232 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 

GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 

QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 

GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 

MREDATILPSPTSETVLTVAAFGVISFIVLLVVVVI 

ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 

ANGEKDSITI.ISMKNINMNNGKQSLSAEKVL 


3233 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 

GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 

QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 

GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 

MREDATILPSPTSETVLTVAAFGVISFIVILVVVVI 

ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 

ANGEKDS1TLISMKNINMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRJLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG

VQAREVRLMKNKSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNILGQKVSMHYSDPKPKINEDWL 

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDTnLRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVIKDKQTQLNRGFAFIQLSTIE 

AAQLLQILQALHPPLTIDGKTINVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYQQDEGYGNSQGTESSLYA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 










GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSEKELEALEKNDMEQMKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKJCQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQILG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 



1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQILPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKTVQFHWRNMHAPGMKK1KLD 

TPEEIARWREERRKNYPTLANIERKKKLKLEKEK 

RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 

KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 

REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRWSRKKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQRBEEEAFASSQSSQGAQSLTFSKFEE 

KKTNEKTRKVTTVKXFFSASSRVGSKKEIQEAKA 

PSPSINRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSIEEQSECAQDFYHNVAE 

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDLAIQKRIRALRWVTPQMLCVPV 

NEDIPEVSDMVVKAITDIIEMDSKRVPRDKLACIT 

KCSKHH^AIKITKNEPASADDFLPTLIYIVLKGNP 

PRLQSNIQYTTRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAK3QJSKDUDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


VLSVCPTGVFRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKGIGREM 
AYHLAKMGAHWVTARSKETLQKWSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD 
MLILNHITOTSLNIJHDDIHHVRKSMEV^ 

VLTVAALPMLKQSNGSIVVVSSLAGKVAVPMVA 

AYSASKFALDGFFSSIRKEYSVSRVNVSrrLCVLG 

LTDTETAMKAVSGIVHMQAAPKEECALEIIKGGA 

LRQEEVYYDSSLWTIlJLIRNPCRKn FFLYSTSYN 
MDRFINK 


3239 


A 


213 


422 


ERTMQLEIKVAL>IFI1FYLYNKLLW/QPLKKK*EA 

HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240 


A 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA
VAHSCNPSTLVGRGGRITRGQELR 




A 

A 


lot 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEAS
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
QCIJ)TA<^DLYRKVMLENYRNLVFLAGIAVSKP 
DLITCLEQGKEPWNMKRHAMVDQPPGR 


3242 


A 


50 


241 



PLPARGKSTLPATFCSPSAPELASMSVVPPNRSQT 
GWPRGVTQFGNKY1QQTKPLTLERTENL 


3243 


A 


380 


702 


FVAYL^KLPFFSQVCIJASSEMFFTISRKNMSQKLS

LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREQI

LDLSKRYVKALAEENKNTVDVENGASMAGYGK
ITVEYF


3244 


A 


37 



1391 


VXMIXjRMMRSMRLREEESPGPSHTASCLCGSAP 

CILCSCCPASRNSWSRLIFTFFLFLGVLVSinVlLSP 

GVESQLYKLPWVCEEGAGIPTVLQGfflDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGF WFFKFLILVGLTVG AFYIPIXiSFThn WFY 

FGWGSFLFJDLIQLVLL1DFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

QPNSGLLQASVITLYTMFVTWSALSSIPEQKCNP 

HLPTQLGNETVVAGPEGYETQWWDAPSIVGLIIF 

LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATQ

QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVL 

ASIJiVMMTLTNWYKPGETRKMISTWTAVWVKI 

CASWAGLLLYL 




A 




42o 


Ssi^UWiilJlJJilJLSLAJU^lTGMFVASHRKMRAHQV 
LTTLLLFVTTSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIVVEAAAGAGALITLLLMLILLVRLPF 
FKEKEKKSPVGLHFLFLLGTLGP 


3246 


A 


3 


515 


HEVCGSGCCCHCCAGGPVARQKALPRLRGVMS 
RFLNVLRSWLVN1VSI1AMGNTLQSFRDHTFLYEK 
LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDI 
HNKTLYHnXWTFlXALGHFLSELFVYGTAAPTT 

CYSJl APT MVAWCiTT (~l\A1 \/m UVT T3\rco\rGT> rxtrrr 
vJ V lsJ\rLfjyl V Aor olL/VjJVLLf V VJL^tv i J_,ll V Erl* V oKA^lvK. 

RN 


3247 


A 


1 


932 


ERLCTPCMQSKIYSYMSPNKCSGMRFPLQEENSV 

THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 

MKSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPP 

SSCDSTNAA1AKQALKKPIKGKQAPRKKAQGKT 

QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 

ESGKEEGMKIDLIDGKGRGV1ATKQFSRGDFVVE 

YHGDLffilTDAKIOOBALYAQDPSTGCYMYYFQY 

LSKTYCVDATRETNRJLXjRLINHSKCGNCQTKLH
DIDGVPHLTHASRDIAAGEELLYDYGDRSKASIE 
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

GFKFFTNFESRKGKELDSNPFASLVFYWEPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTNRLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 

HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIET1E 

VMEDRQSGKKRGFAFVTFDDHDTVDKJVVQKY 

HIWGHNCEVKKALSKQEMQSAGSQRGRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVTNL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANK1FK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDU1DPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHTREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVrNL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERV1GDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGETVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3252 


A 


1 


574 


PLGSOTAPALRVMVQAWYMDDAPGDPRQPHRP 

DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 

KIRRERKYSWMDDTICKDKLPhry^EKIKMFYEE 

HLHLDDEIRYDLDGSGYFDVRDKEDQWIRIFMEK 

GDMVTLPAGIYHRFTVDEKNYTKAMRLFVGEPV 

WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQVTLLDPNE
KY1XRIJLDKTTVSHNTKRFRFALPTAHHTLGLPV 

GKHIYLSTRIDGSLV1RPYTPVTSDEDQGYVDLVI 

KVYLKGVHPKFPEGGKMSQYLDSLKVGDWEF

RGPSGLLTYTGKGHFN1QPNKKSPPEPRVAKKLG 

MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 

TEKDnLREDLEELQARYPNRFKLWFTLDHPPKD 

WAYSKGFVTADMIREHLPAPGDDVLVLLCGPPP 

MVQLACHPNLDKLGYSQKMRFTY 


3254 


A 


1 


968 



LQSAGEGVTHVLILLESPARPVAAVTQVQRRRY 

HRLSDMSMLAERKRKQKWAVDPQNTAWSNDD 

SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 

QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

NTCHGQETTDSSDKKEKKSFSLEEKSKISKNRVH 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 

DASPSTPEENETTTTSAFTIQEYFAKRMAALKNK 

PQVPVPGSDISETQVERKRGKKRNKEATGKDVE 

SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EQLRGPCWDQSSKASAQDAGDHVQPA 


3255 


A 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGICR 

MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 

ILKWLHAQQVQQHCPMCRQEWKFKE 


3256 


A 


2 


377 


TAARRRQKGTAARRRQKGTLEEVVLPPRSCRVF 
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAQTA 
GNVFLKHGSELRIIPRDRVGSC 


3257 


A 


3 


1454 


GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR 

FGPREGQEENKJLFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTFSPSKPAKSLHTQKNRQFFNEPEENFWM 

VMVVRNPI1EKQSKDGKPVIEYQEEELLDKVYSS

VLRQCYSMYKLFNGTFLKAMEDGGVKIXKERL 

EKFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFIhniMEESLNIVKYTAFLYNDQLIWSGLEQ 

DDMRILYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCRFPKIFVNTD 

DTYEELHLIVYKAMSAAVCFMIDASVHPTLDFC 

RRLDSIVGPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFIYFNHMNLAEKSTVHMRKTPSVSLTSVHPD 

LMK1LGDINSDFTRVDEDEEIIVKAMSDYWVVG 

KKSDRRELYVCLNQKNANLDEVNEEVKKLCATQF 

NNIFFLD 


3258 


A 


113 


1558 


APRGCSMPHRKKKPFIEKKKAVSFHLVHRSQRD 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELEPSSTFSAHNRREEK 

EETLVIPSTGIKXPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFIL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL 

FWSEETKSRFIBYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL 

NDYYKEKAENCVKLNTLEPLEDQDLPMNELDES 

EEEEMnVVLEEAKEKWDCESICSTYSNLYNHPQ 

LIKYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE 

RIQMINGSDLPKVSTQPRSKNESKEDKRARKQAI 

KEERKERRVEKKANKLAFKLEKRRQEKELLNLK 
KNVEGLKL 






WO 01/57190 PCT/US01/04098 





3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSHLSFA 

DICVTSTTIPKMLMNIQTQNKVITYIACLMQMyF 

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN

PHLCGLLVIJVSWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQV1QLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKnSSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTVVTPMLOTFIYSLRNKDDCRALGIHLLWGT 

MKGQFFKKCP 


3260 


A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLFl'DPP 

KWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGELSPSELRKJFSNLE 

DILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDNIATYTEWPTEREKVKKAADHCRQIL

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

PNVEELRNLDLTKRKMIHEGPLVWKVNRDKTID 

LYTLLLEDILVLLQKQDDRLVLRCHSKILASTAD 

SKOTTSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASNILVMDHMIMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GIPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQ1MEYWKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST

ATLDDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS

STSTSIVVSMVSSS1XTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

IXPVEVATHYLEnTQLIVESFHFKNGEDAPDLLK 

VITKPFTKJLIVQLDKXVISQIAMNDEKAKNKSLV 

KIWCKTFTNKTQIKVTWSTANCTSPSLCWTDGI 

QNWTMKNVTYKEKLAXCQHIFVNFHLPDLAVGT 

n.Ln^LLVLCGCLIMIVK!LGSVLKGQVATVIKKT 

INTDFPFPFAWLTGYLAILVGAGMTFrVQSSSVFT 

SALTPLIGIGVITIERAYPLTLGSNIGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPWFniLVLCLRLLQSRCPR 

VLPKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 
SDSKTECTAL 


3262 


A 



30 



1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELLDLFNHTLSECHVELSQSTKRVVLF 

ALYLAMFVVGLVENLLVICVNWRGSGRAGLMN 

LY1LNMAIADLGIVLSLPVWMLEVTLDYTWLWG 

SFSCRFTHYFYFVhnviYSSIFFLVCLSVDRYVTLTS 

ASPSWQRYQHRVRRAMCAGIWVLSAIIPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL

LCAYVAVFVMCWLPYHVTLLLXTLHGTHISLHC 

HLVHLLYFFYDVIDCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTQHSI

nTKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


3263 


A 


1 


919 


QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAATHVVIYDASDQGLYSAPRVWWMFRAFG

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFIKTYEDIKENLESRRJFQVVDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA

LGAYLCGKPDVPIYDGSWVEWYMRARPEDV1SE 

GRGKTH 


3264 



A 


1 


1398 


ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGP 

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDWRKESESCDCLQGFQLTHSLGGGTGSG 

MGTLLISIOREEYPDRIMNTFSVMPSPKVSDTVVE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL 

KXTTPTYGDLNHLVSATMSGVTTCLRFPGQLNA 

DLRKLAVNMVPFPRLHFFMPGFAPLTSRGSQQY

RALTVPELTQQMFDSKNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWEP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRI 

SEQFTAMFRRKAFLHWYTGEGMDEMEFTEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA 


3265 


A 


265 


862 


WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 

RELEMLJDSIJLALGGLVLLRDSVEWEGRSLLKAL 

VKKSALCGEQVHILGCEVSEEEFREGFDSDINNR 

LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 

RTDPWVTTALDSLSWLLLRLPCTTLCQVLHAVS

HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE 


3266 


A 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQNDLM 

GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 

IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

LX3QLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 

ILWDLKFLMRNLALGGGLLLLLAESRSEGKSMF 

AGVPTMRESSPKQYMQIX5GRVLLVLMFMTLLH 

FDASFFSIVQNIVGTALMILVAIGFKTKLAALTLV 

VWLFAINVYFNAFWTIPVYKPMHDFLKYDFFQT 

MSVIGGLLLWALGPGGVSMDEKKKEW 


3267 


A 


802 


1011 


ASTFCSAWKRRSTAALWWSGSRASRSHPRELGP 
LO^GTAALSIRSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTTKASNTREKJLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKKARSQRGRPLPSRALPSAHKD

MTTNAGPLHPYWPQHLRLDNFVPNDRPTWHILA 

GLFSVTGVLVVTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWWIAFLRQHPLRFILQLWSVGQIYGDVLYF 

LTEHRDGFQHGELGHPLYFWFYFVFMNALWLV 

LPGVLVLDAVKHLTHAQSTLDAKATKAKSKKN 


3270 


A 


17 


229 


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYSRKQVSDEIKKSRRTVESNPIFFIOCNKKI 

Q 


3271 


A 


419 


553 


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 


3272 


A 


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSV1T 
ILPFSFSCFFHTKCFGLSIFPSVIFTLHVYFILTLVVF
YCC 


3273 


A 


59 


1562 


QAWSLQVAJLSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTTFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKKLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

L1QKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENIKNEBEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 

THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDS1XEDVMKj\LDLVSDPEYINLNDCNKL 

DPEGLGIILLGPFLQEFFPDQGSSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAWMGFEDPMLQTD 

DTPIKRCLQTKWPYIELLWTTDRSPSLN


3274 


A 


186 


1358 


RWHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA

GSRAPQPWRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELWSASN1XVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPIVPL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRWRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

PWLLAGWDVTSLSLLSDRKGLTRRERRELRRR 

TBLLLYYLLRSPFYDRFSEARILFLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 


3275 


A 


575 


759 


SWSASSCKCC^mOCTEQIPDCEQPPASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 


3276 


A 


7 


258 


KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 

QHISSLLVLVSTTCLFAFPRVPIAFESKSCUYHCH 

CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELKDKPLGVQQKYLVVTCNYEARKLGVK 
KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPWERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG 

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL

VSGVFKPNQQTVLLPESCQHLIHSLNHIKEIPGIG 

YKTAKCLEALGrNSVRDLQTFSPKILEKELGISVA 

QRIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

VEAKNKIEELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDILMKLFRNMV>TVKMPFHLTLLSVCFCNLKAL 

NTAKKGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKDKETNRDFLPSGIUESTRTRESPLDTTNF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEDL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

QDIPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDERISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRNHTTDSHKQT 

VATDSHEGLTENREPDSVDEKITFPSDIDPQVFYE 

LPRAVQKELLAEWKRTGSDFHIGHK 


3278 


A 



1 


876 


GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 

KRYYRQRAHSNPMADHTLRYPVKPEEMDWSEL 

YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 

IGCGYGGLLVELSPLFPDTLILGLEIRVKVSDYVQ 

DRIRALRAAPAGGFQNIACLRSNAMKHLPNFFY 

KGQLTKMFFLFPDPHFKRTKHKWRIISPTLLAEY 

AYVLRVGGLW 1 1 IDVLELHD WMCTHFEEHPLF 

ERVPLEDLSEDPVVGHLGTSTEEGKKVLRNGGK 

NFPA1FRRIQDPVLQAVTSQTSLPGH 


3279 


A 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPFMLLG

TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 

PQELAERGVRIVSRGRTQLFALNPRSGSLVTAGRI 

DREELCAQSPLCVVNFNILVENKMKIYGVEVEII 

DINDNFPRFRDEELKVKVNENAAAGTRLVLPFA 

RDADVGVNSLRSYQLSSNLHFSLDWSGTDGQK 

YPELVLEQPLDREKETVHDLLLTALDGGDPVLSG 

TTHIRVTVLDANDNAPLFTPSEYSVSVPENIPVGT 

RLLMLTATDPDEGINGKLTYSFRNEEEKISETFQL 

DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL 

VASAKVWTVQDVNDNAPEVILTSLTSSISEDCL

PGTVIALFSVHDGDSGENGEIACSIPRNLPFKLEK 

SVDNYYHLLTTRDLDREETSDYN1TLTVMDHGT 

PPLSTESHIPLKVADVNDNPPNFPQASYSTSVTEN 

NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG 

APLSSYVS1NSDTGVLYALRSFDYEQLRDLQLWV 

TASDSGNPPLSSNVSLSLFVLDQNDNTPEILYPAL 

DTT^rtGT/^A/UT Anno a t>t*/>x/i x/fi/^Jii a ^iT\T/r\fi/^r\ 

rl L>uo 1 0 VcLArKoAiirOYLVTKV VAVDKDSGQ 

NAWLSYRLLKASEPGLFAVGLHTGEVRTARALL 

DRDALKQSLVVAVEDHGQPPLSATFTVTVAVAD 

RIPDDLADLGSIKTP1DPEDLDLTLYLWAVAAVS 

CVFLAFVIVLLVLRLRRWHKSRLLQAEGSRLAG 

VPASHFVGVDGVRAFLQTYSHEVSLTADSRKSH 

LIFPQPKYADTLLSEESCEKSEPLLMSDKVDANK 

EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 

GTWPKNQFDTEMLQANDLASASEAADGSSTLGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN 
VYIPGSNATLTNAAGKRDGKAPAGGNGNKKKS 
GKKEKK 


3280 



A 


149 


1288 



GTSQMSSHKGSVVAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEKMEEFVYKVWEGRWRVl 

PYDVLPDWLKDNDY1XHGHRPPMPSFRACFKSIF 

RJHTKrGNIWTHLLGFVLFXFLGILTMLimNfMYF 

MAPLQEKVVFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSIVCVLGISAirVAQWDRFATPKHRQT 

RAGVFIXjLGLSGWFTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLWAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEKLA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL

QSSLKKLAVNNIAGIEEVNMn^ 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 


3282 



A 


155 



1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVVPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

IETKYEDNKGSND llb'DNEAKDVEREVCFIDIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKVVR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNIKVCNQHSSPVDDIESHAQTST 


3283 


A 


159 


547 


DCSKLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAWGGWAVGTVLVALSAMGFTSV 
GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCmCGFMNVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPWDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 



A 


123 


1535 



HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNYNAEREFLQGA1T1KAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

BLAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASIJ^EAHKMVREANIKQATAEKQLKEAQGKJ 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDLSVIQPIVKDCKEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDIFP 

CLTFSKSELASAVLEAVENNTLSIEPVGLQPIRFV 

KASAVECGGPKKCALTGQSKSCKHRIKLGDSSN 

YYYISPFCRYRTTSVCNFK1 YIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GrTEEIJLRSQLYPEVPPEEFRPrXAKMRGILKSIAS 
ADMDFNQLEAFLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSRHSAQIHTPVAIIELELGKYGQESEFLCLEFD 
EVKVNQILKTLSEVEESISTLISQPN 


3287 


A 


50 


390 


LGAMAKHHPDLIFCRKQAGVA1GRLCEKCDGKC 
VIODSYVRPCTLVRICDECNYGSYQGRCVICGGP 
GVSDAYYCKECTIQEKDRDGCPKIVNLGSSKTDL
FYERKKYGFKKR 


3288 


A 


3 


428 


RlUFFRFRPCESLCGDMKLLTHhlLLSSHVRGVGS 

RGFPLRLQATEVRICPVEFNFNFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 

HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 

EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRT1 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWA1XDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNmDEYKNPRRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCK.QCGKAFTRSTT1J > VHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI

HERTHSGEKPHECKECGKVFKYFSSLRMERTHT 

GEKPHECKQCGKAFRYFSSLHIHERTHTGDKPYE

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AFISNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHERTHTINR 


3290 



A 


2 


1350 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

WAVAVVVVWSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYVGNTMEDVVLVRIYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWTPKSNLWLKMGKYFSLIPTGFADEDIN 

KIUO^DIPSSQILQEEMTWMKEILSNLGSPVVLCH 

NDLLCKN1IYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILFIQVNQFALASHFF 

WGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 


3291 


A 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLIIGNSSVGKTSFLF 

RYADDSFTSAFVSTVGIDFKVKTVFKNEKRIKLQI

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQKTYSWDNAQVILVGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNINVKQ 

TFERLVDnCDKMSESLETDPAITAAKQNTRLKET 
PPPPQPNCAC 


3292 


A 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTSLQ 
QRTPAEMSPVUffYVRPSGHEGAASGHTRRKLQ 
GKLPELQGVETELCYNVNWTAEALPSAEETBCKL 
MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 
LOTSTPTSThm^SVCRATGLGPVDRVETTRRYRLS 
FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 
ESMPEPLNGPINILGEGRLALEKANQELGLALDS 
WDU5FYTKRFQELQRNPSTVEAFDLAQSNSEHS 
RHWFrTCGQLHVDGQKLVHSLFESIMSTQESSNP
NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 
QQGLRHVVFTAETONFPTGVCPFSGATTGTGGRI 
RDVQCTGRGAHVVAGTAGYCFGNLHIPGYNLP 
WEDLSFQYPGNFARPLEVAIEASNGASDYGNKF 
GEPVLAGFARSLGLQLPDGQRJREWIKPIMFSGGI 
GSMEADfflSKEAPEPGMEWKVGGPVYRIGVGG 
GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 
NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 
LSDPAGAUYTSRFQLGDPTLNALEIWGAEYQESN 
ALLLRSPNRDFLTHVSARERCPACFVGTTTGDRRI
VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 
VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 
LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC
VGPLQTPLADVAWALSHEELIGAATALGEQPV 
KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 
CSGNWMWAAKLPGEGAALADACEAMVAVMA 
ALGVAVDGGKDSLSMAARVGTETVRAPGSLV1S 
AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 
QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 
ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 
NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 
DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 
VSVNGAWLEEPVGELRALWEETSFQLDRLQAE 
PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 
GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 
DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 
SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 
CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 
PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 
MEGAVLPVWSAHGEGYVAFSSPELQAQBEARGL 
APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 
DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT
SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEIKXLrYVHLWWLIXVAKMS

VGHLRLLSHDQVAMPYQWEYPYLLSILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTCWRDEATRSRHHTRFVFQKALRSLRRYPLP 

LRSGKEAKILQHFGDGLCRMLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTOQPARYSLTPEGLELAQKLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ 
QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTrTTVRK£HVGDFVWVAQETNPRDPANP 

GELVLDHIVERKRLDDLCSSIIDGRFREQKFRLKR

CGI^RRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VIDGFFVKJITADIKESAAYLALLTRGLQKLyQGH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTDCCG 

RLQKNLGPALSRTLSQLYCSYGPLT


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSW1EQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQ1AKSFKAQFGKDLTETLKSELSGKFERLIVAL

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

KmGTOEMKFimcni^ATHLLRVFEEYEKlANK 

SIEDSDCSETHGSLEEAMLTWKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRN1VSRSE1DLNLEKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296 


A 



1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTNTLAAQSVIKKDNQTLSHSLKMADQNL 

EKLKTESERLEQHTQKSVNWLLWAML1IVCFIFIS 

MILFIRJMPKLK 


3297 


A 


46 


617 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 
TGIPGSPACRQPVVGLHSLHNYRMAMVSAMSW
VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG 
TCEVIAAHRCCNKNRIEERSQTVKCSCLPGKVAG 
TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL 
PDNSGWMCATGNKIKTTRIHPRT


3298 


A 


157 


748 


IQPPDPRKMTLAAYKEKMKELPLVSLFCSCFLAD 

PLNKSSYKYEADTVDLNWCVISDMEVIELNKCT

SGQSFBVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KKLEAAEERRKYQEAELLKHLAEKREHEREV1Q 

KAIEENNOTlKMAKEKXAQKlVffi 

AMLERLQEKDKHAEEVRKNKELKEEASR 


3299 


A 


5 



892 



TQLPAPLSGVLSRLQLGSGAPLLTVWQETAGVA 

GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAG\OCTLLPVPSFEDVSIPEKPKLRFIERAPL 

VPKVRREPKNLSDIRGPSTEATEFTEGNFAILALG 

GGYLHWGHFEMMRLTTNRSMDPK^ 

APFKPITRKSVGHRMGGGKGAIDHYVTPVKAGR 

LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKDQEERERNNQNPWTFERIATANML 

GIRKVLSPYDLTHKGKYWGKFYMPKRV 


3300 


A 


2 


1847 


FVAGGPRGSGSAAETMPEIRVTPLGAGQDVGRS

CILVSIAGKKVMLIX^GMHMGFNDDRRFPDFSYI 

TQNGRLTDFLDCVnSHFHLDHCGALPYFSEMVG 

YDGPIYMTHPTQAICPILLEDYRKIAVDKKGEAN 

FFTSQMIKX)CMKKVVAVHLHQTVQVDDELEIKA 
YYAGHVLGAAMFQKVGSESVVYTGDYNMTPD 

RHLGAAWEDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

Kl'FVVERMNLKWIYFSTGLTEKANHYYKLFIPWT 

NQKIRKTFVQRNMFEFKHIKAFDRAFADNPGPM 

VWATPGM1J1AGQSLQIFRKWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSIPVGISLGLLKI^MAQGLLPEAKKPRLLHGTLI 

MKDSNFRLVSSEQALKELGLAEHQLRFTCRVHL 

HDTRKEQETALRVYSHLKSVLKDHCVQHLPDGS 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKJELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 


LRRNCSALGGLFQTIISDMKGSYPVWEDFINKAG 

KLQSQLRTTWAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSIEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKJCARQEI 

KKKSSDTLKLQKKAKKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALIEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSP SSHYRYRS SNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 


3303 


A 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 

HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 

DLNKDGQTK£EMLDIMKS1YDMMGKYTYPALR 

EEAPREHVESFFQKMDKNKDGVVTIEEFIESCQK 

DENIMRSMQLFDNV1 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSIFNIEMVKEKTAEEIKQIWQQYFAA 
KJDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVVVVLKW 
GMTLFLLYFPQIFNKSNDGFTTTRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\S S WNES WDFCKGKG CTLATVDNSETLKLLHDL 
HDAEKNYIALP YRS SKYMSTCN GTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRISGHVGIIFSMSYLESKGLLA 

TASBDRSVRIWKGGDLRVPGGRVQNIGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEELQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGIRL 

WHLVGRGYRGLG/DLGSLLQ VP* * ARYTQGCDS 

GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH 

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTXJAIRWGKDINVKrDSRYAFATVH 

VRG AICQEIUa,LTSAEKAIKNKNPPSSKPNRSSS\F 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

DFTK VKPH QAG YKYLL VLVDTFS G WTEAFATK 

NETVNMVVKFLLNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 

YLQSP 


3308 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DI^GRQYVNE VFNFS VDKLYDLLFTNS PFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYV1DAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRIJIVSTELRY 

RKQPWGLVKTFDEKNFWSGLEDYFRHL 


3309 


A 


490 


1077 


NSPSLDrTTONEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVTOAEVLTH 

DWYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFDEKNFWSGLEDYFRHL 


3310 


A 


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

♦RPGL* TMAASDTERDGL APEKTSPDRDKKKEQS 

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKS SRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGKNTA 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA 

AQMAALQAKALAETG1AVPSYYNPAAVNPMKF 

AEQEKKRKML WQGKKEGDKS QS AGNMGKN 


3311 


A 


177 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 


3312 


A 


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCBLKS 

P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 

AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSPVS 

ASAPCRAVPLSPRRLTWPPHLQVGILBPTGRPWK 

NL 


3313 


A 


162 


2 


QLQNLASRGCL* SQLLRRLRRENRLNPGGGGCSE 
IAPXCTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL* SQLLRRLRRENRLNPGGGGCSE 
IAPVCTPAWVTQRDFFRKKK 


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 

KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 

TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 

APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 

PRCPAALRAGAHIGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 

KSNSMLQKPTVAYVRPMDGQESMEPKLSSEHYSS 

QSHGNSMTEUCPSSKAHLTKLKIPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPl'KESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 

SDSEANEPSQSASPEPEPPPTNKWQLDKWLNKV 

NPHKVSPASSVDSNIPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGR\APKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KffiSETPVDLASSMPSSRHKAA-nCGSRXPNIKKES 

KSSPRPTAEKKKYKSTSKS SQKSREIDETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPS S VEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEKKNVPEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRA SESKKPKTEDKNS A 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKXTEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLD SSKPRRTKL VFDDRNY S ADHYLQEAKKLKH 

NADALSDRFEKAVYYLDAWSFBBCGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 

N\TRDIKTAAKELLKKVKFIPGSALNGMVEMMD 

RKPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT 

TEHIVKLVEQHGSDIWWTLPPEQLLPKEVLSEVG 

GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512 


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

SCWPDPSRNTELAFESQLWLCVQLVAIABLTLTF 

GKLSGWVSVPWLLIFSM1LF1FLLGYAWFSSHTSP 

LYWDCLLMRGHEITEQPMKAEVRAGSIMVKEAIF 

LFRKGHSKGKLFLLFFLPFLQVHK1 FK ITDGFH W 

AP 


3319 


A 


407 


1 


SSLHRSPRPASPLPVPEAPXSFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 

WRPSVEFPGNLYRGEGIVYGTLEEVWDCVKPAV 

GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 

MKLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 

CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF 

HTDLSGYLPQNWDSFFPRSMTRFYANLQKAVK 


3321 


A 


37 


360 


SHSA SG AGRPAAP AADLRP APNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


ATVEDKHSGRSYDITSDLGNVLTSTSIAKTVNG*A 

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 

VADHKNLEVIVTNGYDKDGFVHDIQNDIHASSSL 

NGRSTVHVKPIDENLGQTGKSAVCIHQDINDDH 

VEDVT 


3323 


A 


8 


459 


DTLSLNCHXPETLPMTPSF*LSFL*FPGLARAKSIP 
TKTYSNE VVTL WYRPPDILLGSTDYSTQIDMW* G 
QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 
RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 
EAWALCAVETHR 


3324 


A 


1276 


466 


PGSTHASARITIY*L*IILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLIPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 
SSSPRDRDRBR* RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGNffilGLSSEPHILAGAVNPTLGKCNI 

SLPGEHNANLISVL* *GEQGCA*NVFHISFS* AHN 

RNLLSroroHITRTGKJTVT)DHRKFTLRILYDQTGR 

PDLWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSFL*SPQL*LSIICYSAFVSFQSVMLLLHS 

QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 

GYYKNIYTPPDSSTSHQDYSRDGRLLQTLHLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 

LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 

VNARFDYSYNNFRVTSMQAVINETPLP1DLYRYV 

DVSGRTEQFGKFSVINYDLNQVITTTVMKHTKIF 

SANGQVIEVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVG VDANITRYFYE YDADG QLQTVS VNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 

KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 

EDL YTP YGDIYHDTYPDFQ V1IGFHG GLYDFLTKL 

VHLGQRDYDVVAGRWl'lPNHHIWKQLNLLPKP 

FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL 

HNVLPGFPBCPELENSPSI*QMSNSMLHLLCASLS* 

TELGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 

GGKQPRFAAVPSVFGKGDCFAIKDGIVTADnGVA 

NEDSRRLAA1LNNAHYLENLHFTIEGRDTHYFIK 

LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 

LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 

VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRA 
NLGPCRRKRLQTLMRLAAGFQYSSHKDPSLSAK 
EKHTDYHNEARGPWPG WVG* RTADGSCGRGPD 
GAHHPGPKSS S WRA SRLLPGLGGSHHLDA YVGR 
DLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDS 
GPGASP # VETRPLTDGRR*PGVRPVGWTPAHPAG 
TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 
AVPKHRAWRTPLCSQ 


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 

SFCQKGTLTVHQRTHTGEKPYECNECGKNFYQK 

LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 

QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 

GVKLY 


3328 


A 


1 


270 


VTRKLPIFIVDAFTARAFRGSPAADCLLENELDED 
MHQKIAREMNLSETAFIRKLHPTDNFAQRSCFGL 
IWFIPTTOLQILTSSILPSIL 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTEI^IQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430 


FWRNFTGLAPAAAVATTTSSSTN4RFTSISNSLTST 
AAIGLSFTTSTTTTATFITNTTTTITSGFTVNQNQ 
LI^RGFENLVPYTSTVSWTTPVMTYGHLEGLIN 
EGNLELEIKRRLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNKIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPQAXYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

NNVWRINTLILJITNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWIIFLPPLTSCPLWAPGTKHKTELEARSGLGPIK 

AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 

VQLKQHISSRRHEIVDPV 


3334 


A 


304 


410 


AGPSLPSNLRQIFQSLPPFMDILLLLLFFMIIFAI 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRI 
hnTLILRTNQQYIJsn^ISTSVTADVEDFSTFFFLDSQ 
DKJSAVLAECNMYYLTQDDESIISAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGIIYNPFF 


3336 


A 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 

VLERL AG G ATRDSA ASDILLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHIIKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 

LTAFHWELFRCVHELEFVDYVFHGE 


3337 


A 


444 


43 


KUXCLANQFPDISFCPALPAWALLLHYSIDEAE 
CFEKACRILACNDPGRRLIDQSFLAFESSCMTFGD 
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338 


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NV AEYLKL VNN ADKQ Q AGRIKQ VFEKJCNQK 


3339 


A 


1 


665 


AAAASWGLronVNSIVGVSVLTMPFCFKQCGI 

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTC1AFYV 

VIGDLGSNFFARI^GFQVGGTTRMFLXJFAVSLCI 

\^P1^QRNMMASIQSFSAMALLFYTVFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSVIKRTPRKYLAEIVLIDDFSNK£HLKEKLDEY1 ! 

KLWNGLVKVFRNERREGLIQARSIGAQKAKLGQ 

VLIYLDAHCEVAVNWYAPLVAPISKDR 


3342 


A 


385 


2 


NLTWWPLFRDVSFYIVDUMLllF'FLDNVIMWWE 
SLLLLTAYFCYVVFMKFNVQVEKWVKQMINRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSAS1JHWSLMRNSIFQNKJHTLDPHV 


3343 


A 


1 


385 


FRVDNSEEWKDVFUSSERSFKLDSLKCGTWYKV 
KIJVAKNSVGSGRISEIIEAKTHGREPSFSKDQHLF 
THINSTHARLNLQGWNNGGCPITAIVLEYRPKGT 
WAWQGLRANSSGEVFLTELREATWY 


3344 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351 


147 


SPAClTSSI^QHL\DPRAAPreVKVRVMNSTAISL 
Q WNRVY SDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AGIRHEAPPTTSNRHRRQIDRGVTHLNISGLKMP 

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMYWSDWGNHPK 

IETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKLSVIGSIRLNGTDPIVAADSKRGLSHP 

FSIDVFEDYTYGVTYINNRVFKIHKPGHSPLVNLT 

GGLSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSFITPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 

TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 

CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACWNK 

QSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 

NSKMMPECQCPPHMTGPRCEEHVFSQQQPGHIA 

SILIP 


3347 


A 


974 


666 


SPEMESHPITQAGVQ WHHLSSLQPLPPGFK* FSCF 

SLPE*LGYRHVPPCLANfSVFSVEMG\FLHVGQAG 

LELLTSGDLPALASQSAGITGXSHRARPENGFENIF 


3348 


A 


1 


1171 


LSKITMPVICNEPLSFIQRJLTEYM*HTYFIHRPSSL 

S DPVDRMQCVAAFA VSA VASQWERTGKPFNPLL 

GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWT^^CCVH^raVGKLWIEQYG^^V^IINH 

KTGDKCVLNFKPCGLFGKELHKVEGY1QDKSKK 

KLCALYGKWTECLYSVDPATFDAYKKNDKKNT 

EEKKNSKQMSTSEELDEMPVPDSESVFIIPGSVLL 

WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IPKTDCRLRPDIRAMENGEIDQASEEKKRLEEKQ 

RAARKNRSKSEEDWKTRWFHQGPNPYNGAQD 

WIYSGSYWDRNYFNLPDIY 


3349 


A 


403 


497 


NFASSSGKYLRTQKIKCLNNKFTPFPTTEKK* SQS 
VRPP*SNRJY*ILQSn*ISFS*LPN*NFASSSGKYLR 
TQK1KCLNNKFTPFPTTEKK 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSFLESDIRKPARRKIQTTNP 

DFLLLLFMSVPVVSAPPFCPPAEGSRDGRPKASV 

ARPAAVHEHHSPRDCGHLFDVIRSSLGGWQPH*P 

AQPENRLL*LLPVE* GHQHPTVSPVP* AGSPGG AS 

GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 

SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 

UPPGAWVSSSGQRPGLTHOPLAYSHGCVPSEG 


3351 


A 


1 


428 


MAAWAATALKGRGARNARVLRGDLAGATANK 

ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 

GKNPMKAVGLAWAIGFPCGILLFILTKREVDKDR 

VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 
DVGSGVQT 


3352 


A 


2 


841 


RTLFRGRRRREDDRISRPHPSTAESKAPTPKFDLL 
ASNFPPLPGSSSRMPGELVLENRMSDWKGVYK 

EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLEEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNWSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN I 

IIPRGAAGKJREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 


3353 


A 


1054 


587 


IATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 
PPGKKECRVGQYVVDLTSFEQI^LPVLRNADCS 
SGPGQRVCVIDEIGKMELFSQLHQAVRQTLSTPG 
TJDQLGTIPVPKGKPLALVEEIRNRKDVKVFNVTKE 
NRNHLLPDIVTC VQS SRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEWER 

VLTFLPAKAIXRVACVCRJ.WRECVRRVLRTHRS 

VTWISAGIAEAGHLEGHCLVRVVAEELJENVRILP 

HTVLYMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQVLGIVTPGIWTPMGSGSNRPQ 

EIEIGESGFALLrTQIEGIKIQPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRWLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGWGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFILRKCNEVKDDDLFHSYTTIMALIHLGSSK 


3355 


A 


1 


707 


GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYDIY 

SRLLRERIVCVMGPIDDSVASLVIAQLLFLQSESN 

KKPIHMYINSPGGVVTAGLAIYDTMQYILNPICT 

WCVGQAASMGSLLLAAGTPGMRHSLPNSR1MIH 

QPSGGARGQATDIAIQAEEIMKLKKQLYN1YAKH 

TKQSLQVIESAMERDRYMSPMEAQEFGILDKVL 

VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356 


A 


352 


338 


FNYNFCRhn.HMPSFLV*PGMCGLLAKHLSFHIVG 

AFLIT/LGVAALCKFAVA*PRKKAYAJDFYRNYN* 

IKEFEVRKANISQSTK 


3357 


A 


1 


403 


ALGSCGGLLGTGLLKGTMSGTLWSKGIFAGYKR 
RIRIQREHTAVLKJEGWYARDETEFYLRMICANV 
YKANNNTVTPVLTPDKTRVMWRKVTQAHGISI 
MVRAQFRTNLPADAIGHRIRMML*PSRMYTTEPS 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

VMDSERQVKDTDDEESPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 

RGSSD GRGSDSESDLPHRKLPDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPRTVPV 

SVDDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSRKRSVSQDLIK 

KEEERKKMEKLLAGEDGTSERRKSDCTYREIVQE 

KERRERELHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRQQSIJPPPKJ^ATVETTIARASVLDTSMSAGS 

GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEGVARVHGSPLELKQDNGSIEINIKKPNSV 

PQELAATTEKTEPNSQEDKNDGGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 



WO 01/57190 PCT/US01/04098



KDQKKPENEMSGKVELVLSQKVVKPKSPEPEAT 
, i^irJrrJLlJJ\JVlJrtiAlN^ 

TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 

YQVKEQDK\LKEE\WEKAQKEVEEEERRYYEEEP* 

II\EDPWPFTVSSSSAIX?LSTSSSMTEGSGTMNKI 

DRLEEKGSLTEGALAHSGNPVSKGVHEDHQLDT 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 

KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 

r J^OlvOAAJVtLUCr 1 L,ri JL. Y r rtlV^L*rRCG\ICKGQLGDA 

VSGTDVRIRNGLLNCNDCYMRSRSAGQPTTL


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP* S*LPKGSLRHRP*/CPPP VHLPPKSSCPPR

AWAGRATSM*TSSYSSEYQPQTP*ALVTLPPRSY 
YIXTHLLTLTHLHHQILFEP


3360 


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV
RSKDFRDYIJVTSTHFWGPVANWGLPIAAITDMKV 
KSPEIISRRMTFAL* CYSLTFVRFAHYVQVPWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL


3361 


A 


4619 


532 


LLLGRANSPPYNSWRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARJLGDAAGGDPASGQAARGCGARAPRGLGR

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE

RKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFR 

KNQKGIMRQTSKGEDVGYVASEITMSDEERIQL

MMMVKEKMTnEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

KRLHKLVNSTRRVRKKLIRVEEMKKPNSTEGGEE 

HVFENSPVLDERSALYSGVHKKPLFFDGSPEKPP 

EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 

RGLDCPPKKMGTFFS YPEEEKA QK VSRSLTEGEM 

KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVIKSPTASRISLGKKVICSVKET 

MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 

HTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMG 

LLNNKVGTFNFIYVDVLSED\EEKPKRPTRRRRK 

GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD 

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS^CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 

YPTLPLMKSGDALKOGOEEGRLGGGLAP\DT^K'^ I 

CDPPGC*LVLN\KNRRKPPSFPSCRSOETL\EGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 

IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPQKLTTKKLEGS1AASGRGLSPPQCLPRNY 

DAQPPGAKHGLARTPLEGHRKGHEFEGTHHPLG 

TKEG VD AE QRMQPKIPSQPPP VPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQAJLGSPPSTRPPPWLSELPENTS 

LQEHGVKLGPALTR\KVSCARGVDLETLTENKL\ 

HAEGIRSSRREP YS *LRHGRCGI\PVEALVQRYAED 

LIXJPEIU3VAANMIXJIRVKQUIKQHRMAIPSGGL 

TEICRKPVSPGCIS\SVSDWLIS1GLPMYAGTLSTA 

GFSTL\SQVPSLSHTCLQEAG\ITEERHIRK\LLSAA 

RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRG G VG YAHTLHLLPF AGS S WLARARRTDR WT

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSENVLh-l'DVNSILRYLARVAT 

TAGLYGSNLMEHTEIDHWLEFSATKLSSCDSFTS 

TTNELNHCLSLRTYLVGNSLSLADLCVWATLKG 

N AA WQEQLKQKKAP VHVKR WFGFLEAQQ AFQ S 

VGTKWDV STTKARVAPEKKQD VGKFVELPG AJB 

MGKVTVRFPPEASGYLHIGHAKAALLNQHYQV 

NFKGKLIMRFDDTNPEKEKEDFEKVELEDVAML 

fflKPDQFTYTSDHFETLMKYAEKLIQEGKAYVDD 

TPGEQIKAEREQRIESKHRKNPIEKNLQMWEEMK 

KGSQFGH SCCLRAKJDMSSNNGCMRDPTLYRCK 

IQPHPRTGN* YNN V\YPTYDFACPIVDSIEGVTHAL 

RTTEYHDRDEQFYWIIEALGIRKPYrWEYSRLNL 

NNTVLSKRK1,TWFVNEGLVDGWDDPRFPTVRG 

VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 

WAFNKKV IDP VAPR YVALLKKE VTP VN VPEAQE 

EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 

TFSEGEMVTFIhTWGNLNITKIHKNADGKIISLDAK 

LNLENKDYKKTTKVTWLAETTHALPIPVICVT^ 

HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 

LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 

PCVLIYIPDGHTKEMPTSGSKEKTKVEATKNETS 

APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 

VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 

TGQEYKPGNPPAEIGQNISSNSSASILESKSLYDE 

VAAQGEVVRKLKAEKSPKAKINEAVECLLSLKA 

QYKEKTGKEYIPGQPPLSQSSDSSPTKNSEPAGLE 

TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 

AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 

KKXEKENKSEKQNKPQKQNDGQRKDPSKNQGG 

GLSSSGAGEGQGPKKQTRLGLEAKKXEENLADW 

YSQVITKSEM1EYHDISGCTILRPWAYAIWEAIKD 

FFDAJEIKKLGVENCYFPMFVSQSALEKEKTHVA 

DFAPEVAWVTRSGKTELAEPIAJRPTSETVMYPA 

YAKWVQSHRDLPEKLNQWCNWRWEFKHPQPF 

LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 

QVYEELl^IPVVKGRKTEKJEKFAGGDYTTTIEAF 

ISASGRAIQGGTSHHLGQNFSKMFEIVFEDPKIPG 

EKQFAYQNSWGLTTRTIGVMTMVHGDNMGLVL 

PPRVACVQVVIIPCGITNALSEEDKEALIAKCNDY 

RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 

VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN 

EAETKLQA1LEDIQVTXFTRASEDIXTHMVVANT 

MEDFQKILDSGKIVQIPFCGEIDCEDWIKKTTARD 

QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 

KNPAKYYTLFGRSY 


3363 


A 


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 

LAAPKETDCVLTQKVLAETLKPFGGFLKKEEGTA 

SRRNFNFGKN*rNLVKEWIRRNQ*KAKNLPQSVA 

EhTV^GGKJFT/FLGSYRI^GEVHTKGADIIXjVCVF 

APRHVDRSDFFT^SFYDKLKLQEEVKDLRAVEEA 

FVPVIKLCFDGIEID1LFARLALQTIPEDLDLRDDS 

LLKNLDIRCII^LNGCRVTDEILHLVPNIDNFRLT 

LRAIKLWAKRHNIYSNILGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPETPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEILLSKAE 

WSKLFEAPNFFQK YKHY1 VLLA SAPTENQRLEW 

VGLVESKJRILVGSLEKNEFITLAHVNPOSFPAPK 

ENPDKEEFRTMWVIGL VFKKTENSENL S VDLTY 

DIQSFTDTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNHVLQKKKKHSTEG VKLTALNDSSLD 

LSMDSDNSMS VPSPTSATKTSPLNSSGS SQGRNS 

PAPAVTAAS VTNIQATEVS VPQVNSSESSGGTS SE 

SIPQTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATKJPTPIVG VKRTS SPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTDLSDIPALPANPIP 

VIKNSDCLRLNR 


3364 


A 


54 


3073 


SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 

PRNRHV\OlEKTGAEEQ/WKRRGKREL/LVHN!DE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTGnLQWLQSDPYLSSVS 

fflVLDEIHERNLQSDVLMTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMmiPGFTFPVVEYLL 1 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRY1VLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLHPLHSLMPTVN 

QTQVFKRTPPGVRKJVIATNIAETSITIDDVVYVID 

GGKKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAGXRVQPGSLLFICINGS* EASLLGWTIQLPEIF/R 

GTPLEELCLQDCVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQLVRSLNALDKQEELTPLGVHLARLPVEP 

HIGKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYrXSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKIIKAVIC 

AGL YPKVAKIRLNLGKKRKM VXVYTKTD GLV A 

VHPKS VNVEQTDFH YNWLJTVHLKMRTS SI YL YD 

CTEVSPYCLIJFFGGDlSIQKI)>n)QETIAVDEWIVF 

QSPAR1AHLVKRAWHMDERREEQIVQLLNSVQ 

AKNDKESEAQIS WFAPEDHG YDKKYFFKE 


3365 


A 


439 


878 


ECCNVRPLRETDLLKMKRKPRASSPVVEEQPRA 

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRILEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 


3366 


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 

RGRIQLREQLPRYLMGETQLCTTRCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGNPHHYFVATQDQNLSVKVKKKPGVPLM 

FDQNTMVLDKPSPKTIAFVKAVESGNRLSQCMRK 

KVSNISKRNRV* *KTLNRGRRKKRKKISGPNPLS 

CLKKKKKAPDTQSSASEKKRKRKRIRNRSOTKV 

LSEKQNAEGE 


3367 


A 


40 


1467 


MLWGCRAKACWGPRLSDLVA5LSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSrTTFFSETGNGKHWRAVMlDLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRJRKI^TDACSGLOGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFARYPAPQVS

TAVVEPYNSILTTHTTLEHSIX:AFMVDNEAIYDI

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL

NVDLTEFQTKLWYPRIHFPLVRYAPIISAEKAYH

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC

MLYRGDVVPKDVNVAIAAIKTKRTIQFVDWCPT

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE

GEEF 


3368 


A 


3 


2597 


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE 

LNTPHAGAWAQEPEETGLPQGRDTTQLLASEMV 

HLMMK\LKEKJl\RAI*AQKKKMEAAFTKQRQKM 

GRTAFLTWKKKGDGISPLREEAAGAEDEKVYT 

DRAKEKESQKTDGQRSKSLAD1KESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI 

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSSAIAPFSSDX 

SPRXPTHPSSTSLLNRKSASFSVKSQRTPRPNELKI 

TPLNRTLTPPRSVDSLPRJLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETRRKTBEERQKKEDERARREFIR 

QEYMRRKQLKLMEDMDTVIKJPRPQVVKQKKQR 

PKSIHRDHIESPKTPIKGPP V SSLSLASLNTGDNES 

VHSGKRTPRSES VEGFLSPSRCG SRNGEKD WEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHIIQNAL 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 

RDSGCQFRSLYTYCTETEEINKLTGIGPKS1TKKM 

EEGLYKYNSDRKQFSHIPAKTLSASVDAITIHSHL 

WQTKRPVTPKKLLPTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAJKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRiUaiSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMTVWKHGLLI 


3371 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSA\^FPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3372 


A 


239 


3348 


PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 

MSDDVHSLGKVTSDLAKRRKLTS\*GGLSEELGS 

ARRS GE VTLTKGDPGSLEEWETV VGDDFSL YYD 

S YS VDERVDSDSKS EVEALTEQLSEEEEEEEEEEE 

EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 

KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 

GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 

TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 

CMATESVDGELSGCNAAILKRETMRPSSRVALM 

VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 

DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 

QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 

TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 

PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 

LRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQS 

DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 

VDKQQRTPLMEAWNNHLEVARYMVQRGGCV 

YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 

VNAQDSGGWTPIIWAAEHKHIEVTRMLLTRGAD

VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL

HAVKVHGDTPLHIAARESYHDCVLLFLSRGANP 

ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 

GVGNRAIRTEKnCRDVARGYEKVPIPCVNGVDG 

EPCPEDYKYISENCETSTMNIDRNITHLQHCTCV 

DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 

PLIFECNQACSCWRNCKNRVVQSGDCVRLQLYR 

TAKMGWGVRALQTDPQGTFICEYVGELISDAEAD 

VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 

HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 

ELGFDYGDRFWDDCSKYFTCQCGSEKCKHSAEAI 

ALEQSRLARLDPHPELLPELGSLPPVNT 


3373 


A 


587 


1584 


PDGRLrVSCSEDKTIKIWDTTNKQ j 

FANFVDFNPSGTCIASAGSDQTVKVWDVRVNKL 

LQHYQ VHSGGVNCISFHPSGNYLITAS SDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVIXWRTbnTDELHCKGLTKRNLKRLHFDSP 

PHLLDIYPRTPHPHFF.KVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPWGLCQTRV 

KRPVDIS*TLP*CHQNVCQQPRKRKQKT*VTSPV 

KVKA^SIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPN1ALSILDIKMSPSWYFHMAIGIINWNTTAG 
LSGTLYPKVPQKYELFDSVILLLGMLRKIRQVCQ 
NVYMKGCSP1TLFKIVHYWPGAVAHAYNPSTLG 
GQVG/WQIT* GQEFETSLD YMVKPHL Y 


3375 


A 


3 


1051 


VPTOOILAFPEOTNTKDWTVTPEHVLPESOSLLT 

FEEVAMYFSQEEWELLDPTQKAJLYNDVMQENY 

ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 

DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

V1SKKAKVKVPQKTAGKENHFDMHRVGKWHQ 

DFPVKIORKKLSTWKQELLKLMDRHKKDCAREK 

PFKCQECGKTFRVSS\DL\IKHQRIHTEEKPYKCQ 

QCDKKFRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNLHTHQRTHTGEKPFTCHECGKKFSQNS 

HLIKHRRTHTGEQPYTCSICRRNFSRRSSLLRHQK 

LHL*REACPVSHFWKTF 


3376 


A 


137 


2329 


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 

GVGDSEGGPRPLFCRKGALRQKVVHEVKSHKFT 

ARFFKQPTFCSHC1 UFIWGIGKQGLQCQ VCSFW 

HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 

SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 

RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 

HVTVGEARNLIPMDPNGLSDPYVKLKLIPDPRNL 

TKQKTRTVTCATLNPVWNETFVFNLKPGDVERRL 

SVEVWDWDRTSRNDFMGAMSFGVSELLKAPVD 

GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 

CNYPLELYERVRMGPSSSPIPSPSPSPTOPKRCFFG 

ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD 

ELYA1KILKKDVIVQDDDVDCTLVEKRVLALGG 

RGPG GRPHFLTQLHSTFQTPDRL YF VMEYVTGG 

DLMYHIQQLGKFKEPHAAFYAAEIAIGLFFLHNQ 

GDYRDLKLDm^MLDAEGHIKITDFGMCKENVFP 

GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 1 

VIXYEMLAGQPPFDGEDEEELFQAIMEQTVTYP 

KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 

IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 

RGIFDKFFTRAAPAXLTPPARLVLDSIDQADFQGF 

TYVNPDFVQPDARSPTSTVHVPVM 


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 


3378 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC

SVISQDDFFKPESEDETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSWSTDQESAEEIPIHIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW 

YLDGTKSEEDLFLQVYEDHQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


3379 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPELIIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


.3380 


A 


1443 


794 


ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS \ 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHIJECI^CAACQKHFCVGDRYLLINSDIV 

CEQDIYEWTKINGMI 


3381 


A 


945 


474 


SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQNYDNKLVKC1EELCQKQEELCWQIQ 
QEEDKKQRLQNE VRQLTEKLA C VNEKLARVNE 
NLARKJASCSKFY'QTIAETEATYLKILESF*\TLLS 
VRKREAGNLTKATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GIRGKMADRGGVGEAAAVGASPASVPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLIFI 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNLSECKVWRNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 

KNNL* AVTA VPAPKSSA* SSTEERYQCTGIY*LKI 

GNVCKKIRKNKRSSKNNERFDE* ISSS YHVEHP* 

KSLXKSLLELQAYPDVQAVLAKYDDISLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 


3383 


A 


282 


2443 


RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS 

TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

QELGIEGFKRDSDGSL*VWNL\EYGTNLKGTLDI 

KEDMSEPQEKKLSENTDFLAPGVS SFTDSNQQES 

ITKREENQEQPRNYSHHQLNRS SKH SQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDD YFHPKPGLF WEAERAVHSIA YSPSKLREQ 

REKVHENENIGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPVTVCPPTKPLDQVCGTDNQTYASSCH 

LFATKCRLEGTKKGHQLQLDYFG\ASKSIPT\CRD 

FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRNKVKKIYL\DEKRLLAGDHPEDLLLRDFK 
KNYHMYVYPVHWQFSE1X)QHPMDRVLTHSELA 
PLRASLVPMEHCITRFFEECDPNKDKHITLKEWG 
HCFGDCEEDIDENLLF 


3384 


A 


3166 


928 


PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLWSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 

SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRFLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDMSREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLIKVFHRDGHYGFSEPLTF 

CSVVDLINHYRHESLAOYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAJDBAFNK11KIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

LKSRIA\EIHESRT\KL\EQQLL VPRA SDNKRD/IDK 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

IhmWLGIKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVVVDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 


3385 


A 


43 


2372 


TRDVNSWK^LCFNHYNKETTNCYRTTRKWTNY 

KIIFLGPFRELRSQGNQV1LNLGKERCQLRETGLK 

LYLPGMDSARHHISHSTSAGPIPSQKEEEMTESQ

GTVTFKDVAIDFTQEEWKRLDPAQRKLYRNVML 

♦NYN>OJTVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPR1ASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAPSHKXNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VC^CGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFllHQKIHTRE/KPI^aWCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPR1ASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP

YECXECRKAFSHKK^rraQKIHTREKPYEOlEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VChffiCGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK

KmTHQKIHTRENPLSVnVEKASIRLWTSSDI 

3386 


A 


201 


1032 


WDDYPQGALRRREAAEGLHFLGPPGRVRGQLR 

GrTGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 

PPVPGSVWPNWHES/RRGQGVPGLHSAQEPPAG 

V WAA * AAS AAAAVLSIDTAS YKIFVSGKSG VGKT 

ALVAKI^GLEWVVHHETTGIQTTVVFWPAKLQ 

ASSRVVMFRFEFWDCGESALKKFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 


3387 


A 


86 


96 


GSSPDPASL1TMKNQDKKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

POEKKKAKGLGK£riT^LMOTL.NTLSTPEElCL A A I 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HNERNSKLRQENMELAERLKKLEEQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKJDFLLKEAVESQRMCELMKQQETHLKQQLA 

LYTEKFEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 

KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW 

GSHRTSAVRIFS 


3388 


A 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK

NKGKERRDLDDLKKEVAMTEHKMSVEEVCRKY 

NTDCVQGLTHSKAQEILARDGPNALTPPPTTPEW 

VKFCRQLFGGFSILLWIGAILCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAWITTGCFSYYQEAKSSKIME

SFKNMVPQQALVTREGEKMQVNAEEVVVGDLV 

EDCGGDRVPADLRnSAHGCKVDNSSLTGESEPQT 

RSPDCTHE\NPLKTRNITFFSNNFVEGTARGVVVA 

TGDRTVMGR1ATLASGLEVGKTPIA1EIEHFIQLIT 

GVAVFLGVSFFILSLELGYTWLEAVIFLIGnVANV 

PEGLLATVTVCLTLTAKRMARKNCLVKNLEAVE \ 

TLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDNIPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFNSTOKYQLSIHETEDP 

NDNRYLLVMKGAPERIIX)RCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFT)CDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGDCVIMVTGDHPITAKAIAKGV 

GIIFEGNETVEDIAARLNIPVSQVNPRDAKACVIH 

GTDLKDFTSEQIDEILQNHTEIVFARTSPQQKLIIV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMILLDDNFASIVTGVEEGRH 

FDNIJKJCSIAYTLTSNIPEITPFIXFIMANIPLPLGTI 

mCIDLGTDMWAISI^YEAAESDIMKRQPRNPR 

TDKLVNERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKVVEFTCHTAFFVSIVVVQWADLUCKTR 

RNSVFQQGMKNKILIFGLFEETALAAFLSYCPGM 

DVAJLRMYPLKPSWWFCAFPYSFLIFVYDEIRKLI 

LRRNPGGWVEKETYY


3389 


A 


45 


5250 


VERLLGCRNSKRTWRjMLISKhTMPWRRLQGISFG 

MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLE VG ALQA VYELERILNRFI F.KNPDPSA 

SEIREELEQYTTEIVQNNLLGSQGAirVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSWRKEHNS 

KLTITFPAMVHRTAGQKDSEPLGIEEAQIGKRGY 

LTPTSAREHLSALWKNEGFFLNYLFSGMDDDGM 

ESRFNPSVFFLDFLVVPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSL1A1DRSFLSTLPGQSLIDKLYNI 

WIRLQSHVNIWDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYTNTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPLAGLIQDHMVSG A SMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV 

STLLINnPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDELVKPKADVKRQRIIEESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMIDLKFKEEVTWYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMA SGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 

HLEGL WQ YDLTVRDSDG S WQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQL\K WQRSLCEPGEA VGLLAAQ SIGEPST 

QMTLNTFHFAGRGEMKVTLGIPRLREILMVASA 

NIKTPMMSWVLhTIlCKALKRVKSLKKQLTRVCL 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HA YYQQEKCLRPEDILRFMETRFFKLLMES IKKK 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

QPG APG A\EAMERR VQA VREIHPFIDD YQ YDTEE 

SLWCQVTVKLPLMKJWFDMSSLVVSLAHGAVIY 

ATKGITRCLLNETTNNK>fEKELVLNTEGIX^LPELF 

KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 

EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLWGKWRGGTGLFELKQPLR 


3390 


A 


2 


2080 


ILPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEIDAYWLELINSELKEMERPELDELTLERVLE 

ELETLCHQNMARAJETQEGLGEEYDEDWCDVC 

RSPEGEDGNEMVFCDKCNVCVHQACYG13LKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWIPEVSIGCPEKMEPITKISH1PASR 

WALSCSLCKECTGTCIQCSMPSCWTAFHVTCAF 

DHGLEMRHLADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 
NOPLLTPKTDEVDNLAOOEODVLYURI KJ FTHT 

RQDLERVRNLC YMVTRRERTKHA I CKLQEQIFH 

LQMKLIEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVPNGPAASPKPLG 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA 

ERPKV\SLHFD'rh M l'DG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEXWRMGVLAS 


3391 


A 


1555 


327 


NSFLHFLHLKVRTMFLFPSFPVLLLSVVTASCSKT 

KACADTQKTCSMITCGIPVTNGTPGRDGRDRPK 
GEKGEPGLGOVSVAS*ISTSGRCSSKSVLFPATRCi 

l10xrlgeaplssgpmlhseqpl*nalasktklfv 

dslgshistqelgvcgcpfrgvsclvgelalvqa 

lh*vagesfffgsdhwligcaggeqewsiei:lgk 

kkrvtatgssslclatgqglrglqgppgkmgpp 

gntgtsgipgprgqkgdrgdnsvaeaklanler 

KL* slrseldhtkkl*pfslgk\msgkklfvtnge 

RMPFSKVKALCAGLQAWAAPKNAEENKAIQDV 
AKDTAl^GITDEATEGQFMYLTGGRLTYSNWKK 
DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE 
FPA 


3392 


A 


218 


1773 


GGSRIWQIUISIPVLGYFLKQKKMTKAQESLTLE 

DVA\ODFTWEEWQFLSPAQKX>LYRDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPEIEKADDHLQQPLQNQKILKRTGQRYEHGR 

TLKSYLGLTNQSRRYNRKEPAEFNGDGAFLHDN 

IffiQMPTEIEFPESRXPISTKSQFLI^ 

OTCTT)CGKA1^KJKSQLTEHKJRJ 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEXPYGCSECGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLN1HQRTHTGEK 

PYGCroCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLIRHQKIHSGEKPYKCSDCGKAFL 

TKTMLIVHHRTHTGEl^YGCDECEKAYFYMSCL 

VKHKRIHSREKRGD/CSEGGKSFHSKSQLKS* ♦TC 

AGEKPC*YGNCGNGGRAV 


3393 


A 


46 


1464 


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL 

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEFNHVTFKIOSRDKSVTMYLG>niD 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTTTKLQ 

ESLLKIOXjSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTDAEEDKIPKKSSVRL 

LIRKVQHAPLEMGPQPRAEAAWQFFMF\DKPLH 

LAVSLNKIO^LFPMGSPIPWVSVPX^TEKPVKKI 

KA\SVEQVANWLYS\SDY\YVKPVAMEEAQEKV 

PTOSTWTKA\LTLL\PWLVNNRERRGIALDGKKH 

EDTKLASSTHKEGIDRKRSWEILVSYPDQR^SSTV 

SGFLGRASPSQ* SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQD ANLVFVEEFARP* ILKDAGEA*\EGKRJDQE 


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLWQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEOLCVFYrl.NSGSEANDLAIJRLARHYTGH 

QDVVVLDHAYHGHLSSLIDISPYKFRNLDGQKE 

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS* 

KRVVQGRNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEHIRKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIATTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 

DVRGVGLFIGVDLKDEATRTPATEEAAYLVSRJL 

KENYVLLSTDGPGRNILKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395 


A 


1 


1424 


FRDGFSLRCGCNA FT PGRGGDDA ADRAIQRFLR I 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVTTSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHK1 , SNHWKSQHPF 

HGRLTSNMV CKHCEHQS P VRFDTFDSLSL S IP AA 

TWGHPLTLDHCLHHFISSESVRDVVCDNCTKJEA 

KGTLNGEKVEHQRTTFVKQLKLGKLPQCLCIHL 

QRLSWSSHGTPLKRHEHVQFNEFLMMDIYKYHL 

LGHKPSQHNPKLNKNPGPTLELQDGPGAPTPGL 

NQPGAPKTQIFMNGACSPSLLPTLSAPMPFPLPV 

VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD 

GPHL 


3396 


A 


109 


107 


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL 

PHVRAFAYTWFNLQARKRKYFKKHEKRMSKEE 

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYREDFVLTVTGKKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVILFKGIPLESTDGERLV 

KSPQCSNPGLCVQPHfflGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS 

FVT\SGVF\SVT*A*LRVSQTPI\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED 

EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG 

MPSPTTLKKSEKSGFSSPSPSQTSSLGXTAFTQHHR 

PVITGTQSKFHIATPSILVHFPRHSPFFQQPGPYFSH 

PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS 

QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 

PPTTSTEGGAASPTSPTTRS/PGRTRPQQPFL/SYG 

PP*PSNALIGGGGGGAGERAGERADLEM 


3397 


A 


1 


2002 


TGTLTEDGLD VMG WPLKG QAFLPL VPEPRRLP 
VGPIXRALATCHALSRLQDTPVGDPMDLKMVES 

TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLHRFPFSSALQRMSWVAWPGA 

TQPEAYVKGSPELVAGLCNPETTVPTDFAQMLQS 

YTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV 

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRXRA 

VMVTGDNL QTA VTVARGCGMVAPQEHLIIVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLAI^GPTFGIIVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASWSPFTSSMA 

SIECVPMVIREGRC SLDTSFS VFKYMALYSLTQFI 

SVLILYTfNTNLGDLQFLAIDLVri ,, riVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLILAAAVSKGAPFRVRPLTNNVPF 

LLASAL* SS VL WLVLSPGLLHGPLALRNITDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAG\SKKRFKQLERELAEQPWPPLPAGPLR 


3398 


A 


758 


1368 


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 

KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 

RPGQGE/TCLISPKPVTEVLPDVQGAPVPVPPLPT 

PPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA* 

PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 

TSPPGKGFQK1E IRKHPPPRQQHKPKCTANRPLA 

SFL 


3399 


A 


906 


1091 


HHHHHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS 
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400 


A 


1838 


325 


PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNTCSVKIXJTPLQLSVEDTTSPNTKPCPPTPTT 

PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 

KFYNFVI1JIARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAIDHSAFIILLLTVSN 

\FDCR\LSLHQVNQAMMSNLT\RQGSQDCVIP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

NTFKPHRLQARKAMWRKEQDTRALREQSQHLD 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP 

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQPVLIIHHAQMVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE 


3401 


A 


.153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD

DWFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK

VNAGMGNSGITTELTIJCYnTNVTTLETGISSVNA 

GQDVNniTYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NIXTLVE*MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3402 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG

NYSVTPPSSRDLASQKGNISETTVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK

VNAGMGNSGITTELTLKYlIThrVTTLETGISSVNA 

GQDVNUITYKTSL*KINLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTG WNLLTL VE * MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3403 


A 


609 


2765 


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAP1LRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNNISDKHGFTILNSMHKYQPRFHTVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KJDRSPSRG*RA I'PKAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

AAA 1 T 1 1 1 1^ f 1 M m V A A* • A ~ A^^* AA AM A m A4 UvA\A A ^Jm^ fc^ A ^^bW Aj^ 

KASPDSRHSPATISS STRGLGAEERRSPVREGVQA 

PAKVEEARALPGKEAFAPLTVQTDAAAAJHJLAQG 

PLPGLGFAPGLAGOOFFNGHPLFLHPSOFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYP YTYMAAAAAA/S SAAAS AS VHRT 

P\rmNTMRPRLRYSP YSIPVP VPDG S SLLTTALPS 

MAAAAGPLDGKAAALAASPASXVAVDSGSELNS 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 

GLEAKPDRSRS A SP 


3404 


A 


1082 


1308 


LKKFLEVPQSYSLLJLSSPFLQ\WRA*RPQNAIG*Q 

FIIKTLVFFG1MRSAGDVLSTQVSCALRIMRTAGC 

SHSSP 


3405 


A 


1553 


559 


PRPPTQRLSRFAPPCRTAEFPFRRRAWTRPAPPR 

ACTWGRSSPVTGLAVGAAVAMLTVAARSRPFA 

PVLSATSRGVAGALT\P*MQATVPATPEQPVLJ>L 

KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 

HTDDCVPDFSEYRRLEVLDSTKSSRESSEARKGFS 

YLVTG V Tl VG VA Y AAKNAVTQFVSSMSASADV 

LAIJVKffilKLSDIPEGfcNMAFKWGKPLFVRHRT 

QKEffiQEAAVELSQLRDPQHDLDRVKKPEWVILI 

GVCimGCWIANAGDFGGYYCPCHGSHYDASG 

RIRLGPAPLNLEVPTYEFTSDDMV1VG 


3406 


A 


83 


2671 


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVA\nsnFTQEEWALLDISQKNLYREVML 

ETFWNLTSIGKK WKD Q>HE YEYQKPRRNFRS VT 

EEKVNEIKEDSHCGETFTPVPDDRLNFQKKKASP 

EVKSCDSFVCEVGIX3NSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

!OCPYACKECGK>mYHSSIQRHMVVHSGDGPYK 

CKFCGKAFHWLSLY1JHERTHTGEKPYECKQCG 

KSFSYSATHRIHERTH1GEKPYECQECGKAFHSPR 

SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERTHSRKKLYECKQCGKALSSLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCXQCGKAFTRSG SFRYHERTHTGEKPYECKQC 

GKAFRjSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTH1RIHSGERRYKCKJCGKGFYC 

HERTHTGEKPYKCEQCGKAPRA VSIL* MHGRTH 

PEEKPYECTQ*RKAFRSAPHL*IRGRTHNGEKPY 

ACKKCGKPFGSAQNLRIHERTQTrnMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNIJUTCKRTHTGEKPCEYMK^ 

NVAKLSLLPVLFMMKEFTLGRNPISVSNVRKPLF 

LPLIJMMKGLTWERNPMSVCHVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 


3407 


A 


1426 


3 


PAAPSGASPGRVCGVETARPLGVQRRQSADEGP 

PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 

PAVTRAAQAATMVKLLVAKILCMVGV>FFMLL 

GSIXPVXIIETDFEKAHRSKKJLSLCNTFGGGVFL 

ATC\LTALLARC*GKSSRRSWSLGH1 STDYPIAAE 

TILLLGFFMTVFLEQLELTFAQENAVLHRPGDLQR 

RIGRGQRLGV* EPLHGGRAGPRA VRG APRPRPQP 

ERAGPLAVPSPVRLLSLAFALSAHSVFEGLALGLQ 

EEGEKWSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 

SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS 

DRLLKVLF\LWGYTVLAGMGLPQVVSGLATVPA 

AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 

KGPPGTRLCPRSYTLSLRALLLFKJLLSLKSLYQK 

KK 


3408 


A 


106 


4514 


EARDRLAQSRAKEKELNSVASELSARQEESEHSH 

KHLffiLRREFKKNVPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

Ll^LRRKYBEEAASKADEVGLIMTM-EKANQRA 

EAAQREVESLREQLA S VNSSIRLACCSPQGPSGD 

KVNK1 LCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASAKQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEEDCTELSILKAMKLASSTCSLPQGMAK 

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSIKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF

KGEAGGIXVFPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLKHNIGQRVFGHYVLGLSQGSVSEILARPKPX 

WRKLHG* *GKEPFIKMKQFLSDEQNVL ALR17QV 

RQRGSITPRIRTPETGSDDAIKSILEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDADCSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPAL VKQEEG SGGPAQAPLP VLSPAAFV 

QSHRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 

VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARJLPYYP 

AYVPRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQETV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHNVEKIJUDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRWL 

APEEKEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NTNONWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRIKQEQMEEDAEEE 

AGSQPQD SGELDKGQGPPKEEHPDPPGNDGLPK 

VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 1 

VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 

SAKVNPNLQRRHEKMANLNNITYRJLERAANREE 

ALEWEF 


3409 


A 


162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAAH,RTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

ArTTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHPR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSLDCHQRIHSG*\RPYECTECGKSF 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 

LQHRGVPTGERPYECSECGKFFPYSSSLGKHQRV 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLDCHQRIHSR*KPYE\CKCG 

N\R*HPGESP*VHSECQ/KSFS*RPYLIECHTVHKG 

KTLLICRDVQLI 


3410 


A 


167 


789 


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS 

DKLPDLMPPA\EPLGSALELRASLEIDVAE\RGCE 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 

Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 

V/GEQEKEAVRKGSGSSSCSQRGPVPPPGMEVCPL 

LGFWAICP 


3411 


A 


1040 


887 


ASLSKPAGISTMPWALILLFLLTHSAVS WQAGL 
TQPPSVSKDLRVQTATLTCTGNSNNVGHQGVIWL 
QQHQGHPPKLLSYRNNNRPSGISERLSAYKSGNA 
ASLTIYGLQTEHEAD * * CRPRRKLIPKTARLFFFFL 
EDNEEYLLRVY 


3412 


A 


164 


83 


RRGIPGSASLSLTMCVRSCFQSPRLQWVWRTAFL 

KHTTQRRHQGSHRWTHLGGSTYRAV1FDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 

DPRIYKI>CLEQLGLQPSESIFLDDLGTNUCEAARL 

GIHTIKVNDPETAVKELEALLGFTLRVGVPNTRP 

VKKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQSNPTYYlRLANia)LVLRKKPPGTLLPSAHAI 

EREFIUMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSLPGLEPSHKRAIYTAMNTV 

LCKMSVDLQAVGLEDYGKQGSTTWV/YSSRKA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGDPAAEEYFRMY CLQMGL 

PPTENWNFYMAFSFFRVAADLQGVYKRSLTGQA 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRS YHTWARPQSQ WCPTGSRS YS S VPE 

ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 

QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 

W * GGRS GRTS WRJLLALGCHT 


3413 


A 


105 


1573 


PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPERGRAKSRSLSA 

SPAXGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTIMHIQDPASQEILTWNKSPKSVLVIKKMRDAS 

LLOPFKELCTHLMEENMTNTYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFIICLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 

SQ VTQ VIEGN AA WL/RG SRLK VRWKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEVVIDRGPSSYLSNVDVYLDGHLrri" VQGD/G * 

GPQHLSWGP*AFLGRE*RLRLSLSGVIVSTPTGST 

AYAAAAGASM1HPNVPAIMITPICPHSLSFRPIVV 

PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKQAHFEEEEEEEEEG 


3414 


A 


20 


2602 


VIVNKNVNWINYIYYNQQQRAFHELKEKLMSAL 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATAIXAQ 

EADKLTLGQNLNIKAPHAVVTLMNTKGHHWLT 

NARLTKYQSLPCEWHITIEVCNTLNPTTl^LPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYANTVTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

WSGWVEAYPTRTEKAYEVTRVLLRDLEPRFGLP 

LRIGSHNGPVFVADLDCVE1NVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPT <FJ AOTNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNVPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFP AQKNHPDNF WVLKA SIIRQ YYIARVEKD 

FT1J > VGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRDWTAPTGLYWICGHRAYTKLP\ASSCVIGTI 

JCPSFFLLSIKTGELLGFPVYASR\KSIAIRN*NNDK 

WPPERHQYYGPAT*AQIX5SWGYRIPIYMINRIIRL 

QAVLKIITATGRALTILAQQETQMRNAry QNRLA 

LDYLLAAEGEVCRKFNLTNCCLHIDNQGQVVED 

rVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

GGFKTLIIRVirVlGTYLLLPRLl^VIXQMIKSFlAT 

LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIQAMLEPTDEE 
PKEEKPPTFCSRNPTPDQKREDDSG/S AA* DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


FFFFQRINFIEHSGSVSLLALACDLGWCEDWSCC

LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PA' 1 V 1 YSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRKACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417 


A 


243 


847 


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGHISVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPRVKS 

EAGPJLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKJLNFLDEAEKDLATVNSNPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNMSYNP 

FTCEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVWVQELETERRVKRKAPAPPVLSPKTGV

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRG\OCITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPUCKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEJEKAAITETQRKPS 

EDEVLNKGFKDSXSQYVVGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE

LRAMLAJEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNI^EAEKDIJVTVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTEKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ

KLQTLDIGSNLEKEKLENSRSLECRSDPESPUCKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS

EDEVLNKGFKDSXSQYVVGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE

LRAMLAJDEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3420 


A 


612 


1058 


ENLGPN YSHRLLHHPTFYKXIHKJCHHE WTAPIG 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

WFSLALnTTISHCGYHLPFLPSPEFHDYHHLKFN 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 

GFTPLSESIPDSPK 


3421 


A 


23 


2005 


LLTPCDGRJPGRPSVGAESGSDFQQRRRRRRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

1J>VEATIJVKKRKVLEFERVYLDNLPSASMYERS 

YNfHRDVITHWCTKTDFIITASHDGHVKFWKKIE 

EG1EFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 

AMKVFDVVWDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNP V YKA WS SDKSGMIE YWTGPPHE 

YKFPKNVNWEYKTDTDLYEFAKCKAYPTSVCFS 

PDGKKIATIGSDRKVRIFRFVTGKLMRVFDESLS 

MF1BLQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIWDETGHFVLYGTMLGIK\nKVET^ 

RILGKQENIRVMQLALFQGIAKKHRAATTffiMKA 

SENPVLQMQADr^VCTSFKKNRFYMFTKREPE 

DTKSADSDRDVFNEKPSKEEVMAATQAEGPKRV 

SDSAimTSMGDMTKUTVECPKTVENFCVHSRN 

GYYNGHTFHRUKGFMIQTGDPTGTGMGGESIWG 

GEFEDEFHSTLRHDRPYTLSMANAGSNTNGSQFF 

ITVVPTPWLD>nam^GRVTKGMEVVQRIShAVK 

VNPKTDKPYEDVSIIN1TVK 


3422 


A 


2486 


433 


FVLVCAPLTWAGARHRRMAASKKPPRVRVhIHQ 
DFQLRNLRJIEPNEVTHSGDTGVE'llXjRMPPKVT 

SELLRQLRQAMKNSEYVTEPIQAY1IPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAUTEEHAAMWTD 

GRYFLQA AKQMD SNWTLMKMGLKDTPTQED W 

LVSVLPEGSRVGVDPLllP'l'DYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLKMAERNVMWFVVTALDEI 

AWLFhn^RGSDVEHNPVFFSYAJIGLETIMLFIDGD 

RTOAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPICIAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVT)LSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLmSGAQYKDGTTDVTOTMHFGTPTAYEKJEC 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGKGVTDEPGYYEDGAFGIRIENVVLVV 

PVKTKYNFNNRGSLTFEPLTLWIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVXDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGMEDFNSVIQQMAQGRQIE 

YIDffiRPSTGGLGFSWALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQNISHQQ 

AIALLQQTTGSLRLIVAREP VHTKSSTSS SLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTIVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGHIQVNDKIVAVDGVNIQGFANHDWEVL 

RNAGQWHLTLVRRKTSSSTSPLEPPSDRGTWE 

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKIVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEELKAVPPGLVHLGICKPLVEDN 

EEESCYILHSS SNEDKTEFSGTIHDINSSLDLEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPS VPSTEGN S QQGRFDDLENLNS LA 

KTSIJOLGMIPNDVQGPSLLIDLPVVAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISIV 

GGQTVDCRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

VWWQSLSSTPRVIPNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE

DAFTDQKIRQRYADLPGELHIIELEKDKNGLGLS 

LAGNKDRSRMSEFVVGINPEGPAAADGRMHIGD

EIXEINNQILYGRSHQNXASAIDCTAPSKVKLVFIR 

NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

VDGSLEWGIKQLPESESFKLAVSQMKQQKYPTKV 

SFSSQEBPLAPASSYHSTDADFTGYGGFQAPLSVD 

PATCPIVPGQEMIIE1SKRRSGLGLSIVGGKDTPLV 

NGVDLRNSSHEEAITALRQTPQKVRLVVYRDEA 

HYRDEENLEEFPVDLQKKAGRGLGLSIVGKR 


3424 


A 




1162 


HASERVVOLPDFVWDOYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAXQVASRVQKYFIKLTKAGIPVPGRTPNLYI

YSKKSSTSRRQHPLNKHLFKPXGTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFC\DS 

aSDCLHETVDIHKGDHQLEPIYRSVETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3425 


A 


2223 


1162 


HASERVVOLPDFVWDOYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKPVGTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCRVDCPP\EMSL\DFC\DS 

aSDCLHETVDIHKGDHQLEPIYRSVETFLDRDYCV

SQGTSYNYLDPNYFPANR 


3426 


A 


2 


1553 


LFVWHDDPRWGTPRYWLGALYRNQQSSPTAPP 

G1XPLEYIVAAPHCSHSRQWRCSQTHRIHHHPQ 

MLGPCRQEICGITMAAGTLYTYPENWRAFKALI 

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQWQWVSFADSDIVPPASTWVFPTLGIM 

HHNKQATENAKEEVRRILGLLDAYLKTR'rFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\WGEVKLCEKMAQF\DAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVLDEFKRKYSNEDTLSVALPYFWEHFDKDGW 

SLWYSEYRFPEELTQTFMSCNLITGMFQRLDKLR 

KNAFAS VILFGTNNSS SISGVWVFRGQELAFPLSP 

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE I 

G AFQHVGKAFNQ GKJFK 


3427 


A 


755 


52 


TAARRRQKGTAARRRQKGTAARRRQKGTAARR 

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

AARRRQKGTAARRRQKGTAARRRQKGTAARRR 

QKGLSNLDAAEWLPPKKGXGEKKKGPFLAINEV 

VTNREYPIhnLKRIHGVGFKKRAPRALKEIRKFAM 

KBMGTPDVRIDTRLNKAVWAKGIRNVPYRIRVR 

LSRKRNEDEDSPNKLYTLVTYVPVT1VKNLQTV 

NVDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSKMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETG1AGLHLPNGGVEGAVLGKGGKPQ 

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGEFTCPVGGVYYFAYHVHVKG 

ThTVWVALYKNNN^ATYTYDEYKKGYLDQASG 

GAVLOLRPNDOVWVOMPSDOANGLYSTEYTHSS 

FSGFLLCPT 


3429 


A 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG 

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQXAQLGQLS 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVLYQVVAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A 


799 


1989 


INKYINIRKJOKLLSPLPPLWSHLALLQASATKWV 

LTP AAFAGKLLS VFROPLSSLWRSLVPLFC WLRA 

TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 

PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 

KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 

LVRFCKVELRLPSVSI\VSNGSLIRERWFQNYG\E 

YLDILAISCDSFDEEVNCP\IGRGN\GKKNHVENL 

QKLVRRWCRDYRVPFKINSVINPFVNVEEDMTEQI 

KALNPVRWKVFQCLLIEGENCGEDA\LREAERFV 

IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 

DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 

DEKMFLKRG GKYIWSKADLKLDW 


3431 


A 


5468 


2146 


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHQLCQPGEHIRC 

RQEAELARMGFDLQNVWRV SHINSNYKLCPS YP 

QKLLVPVWTTDKELENVASFRSWKRIPVVVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLUJDARSYTAAVANRAK 

GGGC3CEEYYPNCEVWMGMANIHAIRNSFQYL 

RAVCSQMPDPSNWLSALESTKWLQHLSA^MLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPOEREK 

RhnYK/RGTCSVWALLRAGNKNFHNFLYTPSSD 

MVLHP VCHVRALHL WTA VYLPA SSPCTLGEEN 

MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSG VGGPQQTVGEVGLPPPLPS SQ 

KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 

SDPEIKVLEETKGPAPDPSAQDELGRTLDGIGEPP 

EHCPETEAVSALSKV1SNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDF\LNQDPSGSVASISHQEQLSSVP 

DLTHGEED1GKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH 

RLRQIEAGYKQEVEQLRRQVRELQMRLDIRHCC 

APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 

IXJLSEASWEPVDKKETEVTRWVPDHMASHCYN 

CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVLVCNSCYEHIQVSRARELMSQQLKK 

PIATASS 


3432 


A 


36 


1873 


MTFFSSVADFIGLDPRIAAWLIDPSDATPSFEDLV 

EKYCEKSITVKVNSTYGNSSRNIVNQNVRENLKT 

LYRLTMDLCSKLKDYGLWQLFRTLELPLIPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 

TGLQKYPSTVSEALNALRDLHPLPKIELEYRQVH 

KIKSTFVDGLLACMKKGS1SSTWNQTGTVTGRLS 

AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFLAADFSQIELRILTHLSGDPELLKL 

FQESERDDVFSTLTSQWKDVPVEQVTHADREQT 

KKWYAWYGAGKERLAACLGVPIQEAAQFLES 

FLQKYKKIKDFARAAIAQCHQTGCVVSIMGRRR 

PLPRIHAHDQQLRAQAERQAVNFWQGSAADLC 

KLAMIHVFTAVAASHTLTARLVAQ1HDELLFEVE 

DPQIPECAALVRRTMESLEQVPLKVSLSAGRSWG 

HLVPLQEAWVALRQAHVALSLPATAWLPLGPLP 

APSPHPCIFRLHFVCSPRQQWEERTGFQQSIVWPS 

PRSPALYAPGRINPLGLGWPAIPWSKCLCKALKK 

K 


3433 


A 


1481 


476 


IPPKERAPGIRASCLA1TAG ARPTS YGRVG CEGD V 

RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA 

ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP

KESVQELRDWLYEHRYNAYPSEQEKALLSQQTH 

LSTLQVC^TWFINARRRLLPDMLRKDGKDPNQFTI 

SRRGAKISETSSVESVMGIKNFMPALEETPFHSFT\ 

AGPNPTLGNRPLSAKP/SQSPGSVLAJ^SVICHTTV 

TAIERLSLSLSCQSVGCGQNT\DIQQIAT\RNLRDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 


3434 


A 


1720 


1243 


NGPVPPGGSKTKWAGGSAAEGSPRLSPSPGAAQ 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 

PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 

IGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 

LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 

RTVQLNVCSSEEVEKV/VGDCPLEPEGPXEKGMW 

GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 

VRMTKSFLPLIRRAKGRVVNISSMIXjRMANPAR 

SPYCmCFGVEAFSDCLRYEMYPLGVKVSVVEPG 

>niAATSLYSPESIQAIAKKMWEELPEVVRKDYG 

KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT 

A11PYTRYHPMDYYWWLRMQIMTHLPGAISDM 

IYIR 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 

GNLSAEIERFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKRNMEVAMM 

DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 

PTEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQS PRKGPQTVNSS 

SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTFCLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPNIOKLLYORFbTTLAGGMEGTPFYOPSPSO 

DFMVTLADVDNGNTNANGNLEELPPAQPTAPLP 

AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

PLPP ASHPPATSTNKRTNLKKPNS ERTGHGLRVR 

FNPLAIXLDASLEGEFDLVQRUYEVEDPSKPNDE 

GITPLHNAVCAGHHHIVKFLLDFGVNVNAADSD 

GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 

IETAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 

KGVAYALWDYEAQNSDELSFHEGDALTTLRRKD 

E 


3436 


A 


3 


2604 


GSTHASEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SCIMHVDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYDFHAYKEVA 

QTLYETLAS\YTHNIEAVSCDEALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TLDQATDNAKIIGKAMLNMFHTMKLNISDMRGV 

GIHVNQLVPTNLNPSTCPSRPSVQSSHFPSGSYSV 

RDVFQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPAHLPTSPDTNKAESSGKWNGLHTPV 

SVQSRLNLS1EVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 

PKNPLLHLKAAVKJEKXRNKKK^ 

NNKLLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 

EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 

AGAVEFNDVKTLLREWITnSDPMEEDILQVVKY \ 

CTOLIEEKDLEKLDLVIKYMKRLMQQSVESVWN 

MAFDFILDNVQVVLQQTYGSTLKVT 


3437 


A 


32 


4038 


SLLRLLKAQWGSSGAASEPVVLGEEGCGFPSTNE \ 

YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 

DQEEACASRQLHSLASFSTVTARRNPLHNPWGM 

ELAASENTDSPSPRPLRPG VTLPP G ALTMNTKDT 

TEVAENSHHLKJUFLPKKXLECLPRCPLLPPERLRW 

NTNEEIASYUTFEKHDEWLSCAPKTRPQNGSIIL 

YNRKKVKYRKDGYLWKKRKIXjKTTREDHMKL 

KVQGMECLYGCYVHSSIVPTFHRRCYWLLQNPD 

IVLVHYLNVPAT .EOCGKGCSPIFCSISSDRREWLK 

WSREELLGQLKPMFHGIKWSCGNGTEEFSVEHL 

VQQILDTHPTKPAPRTHACLCSGGLGSGSLTHKC 

SSTKHRIISPKVEPRALTLTSIPHPHPPEPPPLIAPLP 

PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 

SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR 

PSMSLAWVGTEPSAPPAPPSPAFDPDRFLNSPQR 

GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 

EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 

KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 

GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT 

PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 

AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 

LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 

DWLSLDDNQFRMSILERLEQMEKRMAEIAAAGQ 

VPCQGPDAPPVQDEGQGPGFEARVVVLVESMIP 

RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 

LSTLSQWRSVETGSLDLEQEVDPLNVDHFSCTPL 

MWACALGHLEAAVLLFRWNRQALSIPDSLGRLP 

LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP 

PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 

PPPAPLPASEMTMEDMAPGQLS SGVPEAPLLLM 

DYE ATNSKGPLS SLPALPP ASDDG AAPEDADSPQ 

AVDVTPVDMISLAKQUEATPERIKREDFVGLPEA 

GASMRERTGAVGLSETMSWLASYLXENVDHFPS 

STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 

IGKLIFALLTLVSDXQEQRELYEAARVIQTAFRKYK 

GRRLKEQQEVAAAVIQRCYRKYKQLTW1AJLKFA 

LYKKMTQAAILIQSKFRSYYEQKRFQQSRRAAV 

LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 

KQDQAARKIMRFLRRCRHRMRELKQNQELEGLP 

QPGLAT 


3438 


A 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATFTQSLK 

VVTKRGSADGCTDWSroDCKYQVLVGEPVRIKC 

ALFYGYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

EPIAFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDIEDFLLPTREPEILWYKECRTKT 

WRPSIVFKRDTLLIREVREDDIGNYTCELKYGGF 

V VRRT1 HLTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKFEE 

DLDENRVWESDI\KJLKEHLGEQEVSISUVDSVEE 
GDLGNYSCYVENGNGRKHASVLLHKRFT MYTV 

EI^GGLGAILLLLVCLVTIYKCYKIEIMLFYIWHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKJLFIPDRDLIPTGTYI 

EDVARCVDQSKJRillVMTPNYVVRRGWSIFELET 

RL1WMLVTGEIKVILIECSELRGIMNYQEVEALK 

HTIKLLTVIKAVHGPKCNKLNSKFWKJ^ 

KRJEPrTHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTFHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYOOTMT1JNGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSILIGGGHLFLIRSCLi^LLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDVVTETEGKQ 

NRAVFDAVMVCTGHFLNPHLPLEAPTCIHKFKG 

QlLHSQEYKJPEGFQGKR\n-VIGLGNTGGDlAVEL 

TRRCCSFTAQVLPSRFLNWIQERKXNKRFNHEDY 

GLSITKGKKAKJFIVNDELPNCILCGAITMKTSVIE 

FTETSAVTClXjTVEENroWIFTTGYTFSFPFFEEP 

UCSLCTKJOFLYKQVFPLNLERATLAnGLIGLKGS 

ILSGTELQARWVTRVFKGLCKRPASQKLMMEAT 

EKEQ L IKRG VFKDTSKDKFD YIA YMDDIAACIGT 

KPSIPLLFLKDPRLAWEVFFGPCTPYQYR\LMGPG 

KWDGARNA1LTQWDRTLKPLKTRIVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 


3440 


A 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCffiSVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQ AIQMACQNL VDPGS SPSQVLSAATI V 

AKHTSALCNACRIASSKTANPVAKRHFVQSAKE 

VAN STANL VKTIKALDGDFSEDNRNKCRJATAPL 

1EAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV 

S AKPMLESS S YLIRTARSLAINPKDPPTWS VL AG 

HSHTVSD SIKSLITSIRDKAPGQRECD YSID GENRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSW 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKIlJDHQQQNnVLDQTKTLAESAL 

QMLYAAKEGGGNPKAQHTHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTVVKYSKAJAVTAQEM 

MTKSVTNPEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGXAL 

QVCPTDSYTKRELEECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSG11ADLDTTIMFATAGTLN 

AENSETFADHRENILKTAKALVEDTKLLVSGAAS 

TPDKLAQAAQSSAAT1TQLAEVVKLGAASLGSD 

DPETQWLINAIKDVAKALSDLISATKGAASKPV 

DDPSMYQlJCGAAKVMVThTVTSLLKTVKAVEDE 

ATRGTRALEATIECIKQELTVFQSKDVPEKTSSPE 

ES IRMTKGITMATAKA V AAGNS CRQED VIATAN 

LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTECTLGYLDLLEHVLVILQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE 

TELLGAAASIEAAAKKLEQLKPRAKPKQADETL 

DFEEQILEAAKSIAAATSALVKSASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS 

LCEAANASVQGHASEEKLISSAKQVAASTAQLL 

VACKVKADQDSEAMRRLQAAGNAVKRASDNL 

VRAAQKAAFGKADDDDVVVKTKFVGGIAQnAA 

QEEMLKKERELEEARKKLAQIRQQQYKFLPTEL 

REDEG 


3441 


A 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSALHS 

PAHRProFSVAQKPFGATYVWSSIINTLQTQVEV 

JOOUamLKJUmDCFVGSEAVDVIFSHLIQNKYF 

GDVDIPRAKVVRVCQALMDYKVFEAVPTKVFG 

KDKKPTFEDSSCSLYRFTTIPNQDSQLGKENKLY 

SPARYADALFKSSDIRSASLEDLWENLSLKPANS 

PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 

DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 

KAYSDSQEDEWLSAAIDCLEYLPDQMVVEISRSF 

PEQPDRTDL VKELLFDAIGRY YS SREPLLNHLSD 

VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE 

EFRRLLYFMAVAANPSEFKLQKESDNRMVVXRI 

FSKAIVDNKNLSKGKTDLLVLFLXMDHQKDVFKI 

PGTL\HKIVS\VK\LMAIQNGRDPNRDAGYIYCQR1 

DQRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 

KKKVLLGQFYKCHPD1FIEHFGD 


3442 


A 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAAQQ 
VAEDKFVFDLPDYESINHVWFMLGTIPFPEGMG 
GSVYFSYPDSNGMPVWQLLGFVTNGKPSAIFKIS 
GLKSGEGSQHPFGAMNIVRTPSVAQIGISVELLDS 
MA QQTP VGN A A VSS VDSFTQFTQKMLDNF YNF 

ASSFAVSQAO>DDTQ/RPSEMFIPANVVLKWYENF 
QRRTSTEPSLLENIIWUCINF 


3443 


A 


3 


1373 


SWHVRRRWLEATMAGGMKVAVSPAVGPGPWG 

SGVGGGGTVRLLL1LSGCLVYGTAETDVNWML 

QESQVCEKRASQQFCYTNVLIPQWHDIWTRIQIR 

VNSSRLVRVTQVENEEKLKELEQFSIWNFFSSFL 

KEKXNDTYVNVGL Y S TKTCLK VEIIEKDTK YS VI 

VIRRFDPKLFL VFLLGLMLFFC GDLLSRS QIF YYS 

TGMTVGrVASLVLIIIFILSKFMPKKJSPIYVILVGGW 

SFSLYLIQLVFKNLQEIWRCYWQYLLSYVLTVGF 

MSFAVCYKYGPLENERS1NLLTWTLQLMGLCFM 

YSGIQIPHIALAirilALCTKNLEHPIQWLYTTCRKV 

CKGAEKPVPPRLLTEEEYRIQGEVETRKALEELR 

EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 

PNEVSVHEQEYGLGSIIAQDEIYEEASSEEEDSYS 

RCPATTQNNFLT 


3444 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 
EPGXGSLGWVLPNTAMKKKVLLMGKSGSGKTS 
MRSIIFANYLARDTRRLGATILDRIHSLQINSSLST
YSLVDSVGNTKTFDVEH SHVRFLGNLVX.NL WDC 
GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 
ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD
LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATTLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPGXGSLGWVLPNTAMKKKVLLMGKSGSGKTS

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFKNVEVLIYVFDVESR

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 


3446 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPGNGSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSHFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV

DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTAJDHPVIVTMASKRKSTTPCMIP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

AS SEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEHTDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEfflT 

KTPIMK1MKGKAEAKKIHTLKENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHKFPYPTKAELCYLTVVTKYPEEQLKJW 

FTAQRLKQGISWSPEEIEDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHVVGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKWNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSIIIDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFAQNPLPLDEELDRLRSETKMTRREIDS 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHDLAERKVSPIK 

INLKmRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEWRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 

MLYEEDLQNLCDKTQMSSQQVKQWFAEKMGEE 
TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 
VSENSESWEPRVPEASSEPFD\TSSPQAGRQLETD


3448 


A 



2 



1324 


FVARAEKGFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKJRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEIX^ITRLIHLNKIDPHAPNEMLYGRIGYIY 

AXLFS^NKNFGVEKIPQSfflQQICEmTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVCQLKFPSGN 

YPPCIGDNRDLLVHWCHGAPGVIYMLIQAYKVF 

R/EREKYLOODAYQCADVIWQYGLLKKGYGLCYN 

GSAGNAYAFLTLYNLTQDMKYLYRACKFAEWC 

LEYGEHGCRTPDTPFSLFEGMAGTIYFLXADLLFP 

TKARXFPAFEL 


3449 


A 


3 


2389 


SRHVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGKLYAMKVLRKAALVQRAK 

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

rILILDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVLALEHLHKLGIIYRDLKLENVLLDSEGHIVLTD 

FGLSKEFLTEEKERTFSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGILLFELLTGASPFTLEGERNTQAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEV1WHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGQ\PPPG 

DPRIFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNWNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGVVHRDLKPENILYADDTPGAPVK1IDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

ESCDLWSLGVILYNMMLSGQAPFQGASGQGGQS 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPImAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS


3450 


A 


201 


1705 


KG1EMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

WI^PGITfFDPEKKRYFRLU^GH^CWLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQLGFLNVThTYCHLAHELRl^CMERKKVQIRS 

MDPSALASDRFha.ILADTNSDRLFTVNDVTVGGS 

KYGIl^QSLKIPllJKVFMHENLYFrNl^ 

CWASLNHLDSHILLCLMGLAETPGCATLLPASLF 

VNSHPAGIDRPGVMLCSFRIPGAWSCAWSLNIQA 

NNCFSTGLSRRVLLTNWTGHRQSFGTNSDVLA 

QQFALMAPLLFNGCRSGEIFAIDLRCGNQGKGW 

KATRLFHDSA\n^VRILQDEQYI>lASDMAGKIK 

LWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 

LVAVGQDCYTRIWSLHDARLLRTIPSPYPASKAD 

IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 



3451 



19 



6033 



LLSAMLSHGAGLALW1TLSLLQTGLAEPERCNFT 

LABSKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALCrnfFRIDKrTYGCNlXJDLQAGTrYW 

ERTVVLQTDPLPPARFGVSKEKTTSTGUWWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTrTNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGWDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVS^^KVTND 

GSLTSLKVKWQRPPGVNVDSYNrTLSHKGTIKESR 

VLAPWIT\ETHFKELVPGRLY\QVTCSAVSLGELS 

AQKM\AVGRTFPDKVANLEANNNGRMRSLVVS 

WSPPAGDWEQYRILLFNDSVVLLNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTWLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSVWTTVSGGISSR 

QVVVEGRTWSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKWQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKNNHQTKSEPKSENECVFVQLVPGRLYSVTVT 

TKSGQYEANEQGNGRTTPEPVKDLTLRNRSTEDL 

HVTWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNIHISPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINWGRTVPASVQGVIADN 

AYSSYSLIVSWQKAAGVAERYDILLLTENGILLR 

NTSEPATTKQHKFEDLTPGKKYKIQILTVSGGLFS 

KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGSNRNTTDSLWFNWSPASGDFDFYELILYN

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SIAITWKGPPDWTDYNDFELQWLPRDALTVFNP 

YWrRKSEGRIVYGLRPGRSYQFhTVTKTVSGDSWK 

TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 

PPDSDFDGYSIECRKMDTQEVEFSRKLEKEKSLL 

NIMMLWHKRYLVSIKVQSAGMTSEVVEDSTIT 

MmRPPPPPPHIRVNEKDVLISKSSINFTVNCSWFS 

DTOGAVKYFTVVVREADGSDELKPEQQHPLPSY 

LEYRHNASIRVYQTNYFASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

SIRAFTQLFDEDLKEFTKPLYSDTFFSLPnTESEP 

LFGAIEGVSAGLFLIGMLVAWALLICRQKVSHG 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPDC 

INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 

SCDIAIXPENRGKNRYNNILPYDATRVKLSNVDD 

DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 

WKMVWEQNVHNTVMVTQCVEKGRVKCDHYW 






WO 01/57190 



PCT/US01/04098 




PADQDSLYYGDLILQMLSESVLPEWTIREFKICGE 

EQLDAHRLIRHFHyTVWPDHGVPETTQSLIQFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTF1ALDR 

ILQQIX>SKDSVDIYGAV\HDLRLHRVHMVQTEC 

QYVYlJSQCVRDVLRARKLRSEQEhfPLFPIYENV 

NPEYHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDKGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVWDTPGEFDTE

VPNAETSKEIIRCILLTSPGPHALLLVVPLGRYTEE 

EHKATEKILKMFGERARSFMILIFIRKDDLGDTN 

LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLLGLIQRWRENKEGCYTNRMYQR 

AEEEIQKQTQAMQELHRVELEREKARIREEYEEK 

IRKLEDKVEQEKRKKQMEKJKLAEQEAHYAVRQ 

QRARTEVESKDGILELIMTALQIASFILLRLFAED 


3453 



A 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAJTTLVQAV 

DKXVDCPRLCTCEIRPWFITRSIYMEASTVDCMD 

LGLLTFPARLPANTQELLLQTNNIAKEBYSTDFPV 

>0,TGLDLSQNT^SSVTNmGKKMPQLLSVYLEEN 

KLTELFEKCI^ELSNLQELYINHNLLSTISPGAFIG 

LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG

ENPHRIKDMNFKPLINLRSLVIAGINLTEIPDNAL 

VGLENLESISFYDNRLIKVPHVALQKWNLKFLD 

LhOCOTIhnURRGDFSNMLHLKELGINNMPfeLISID 

SLAVDNLPDLRKBEATNNPRLSYIHPNAFFRLPKL 

ESLMLNSNALSALYHGTIESLPNLKEISIHSNPIRC 

DCVIRWMNMNKTNIRFMEPDSLFCVDPPEFQGQ 

NVRQVHFRDMMEICLPLIAPESFPSNLNVEAGSY

VSFHCRATAXEPQPEIYWITPSGQKLLPNTVLTDKF 

YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 

SVMIKVDGSFPQDNNGSLNIKIRDIQANSVLVSW 

KASSKILKSSVKWTAFVKTENSHAAQSARIPSDV 

KV YNLTHLNPSTE YKIdDLKll YQKNRKKC VNVT 

TKGLHPDQKEYEKNNrilLMACLGGLLGnGVIC 

LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 

PLINLWEAGKEKSTSLKVKATVIGLPTNMS 


3454 


A 


1844 


244 


ERYIJvATYVAPSATIJDIGLQQEKKKEIYMKIQPP 

FEDLFDTAJE£Y1LLLLLEPWTKMVKSDQIAYKKV 

ELVEETRQLDSTYFRKLQALHKETFSKKAEDTTC 

EIGTGILSLSNVSKRTEYWDNVPAEYKHFKFSDL 

LNNKLEFEHFRQFLETHSSSMDLMCWTDIEQFRR 

ITYRDRNORKAKSIY1KNKYLNKKYFFGPNSPAS 

LYQQNQVMHLSGGWGKILHEQLDAPVLVEIQK 

HVQNRLENVWLPLFLASEQFAARQKIKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCKIIAFRK 

ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVQ 

KYKDLCHSHCDESVIQKKirrilNCFINSSIPPALQI 

DIPVEQAQKnEHRKELGPYVFREAQMTTLGVMF 

KFWPQFCEFRKmTDEMMSVLERRQEYNKQKK 

KLA VL/QNDEKS GKDGDCQYANTS VPAIKTALLS 

DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 

ELEKXSCLQACNLSQILRLALQLCL 


3455 


A 


228 


3330 


APTAQAMMSFGGADALLGAPFAPLHGGGSLHY
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS 

ASPSRFRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALRE1RAQLEGHAVQSTLQ 

SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

LNVKMALDIEIAAYRKLLEGEECRIGFGPIPFSLP 

EGLPKIPSVSTHIKVKSEEKIKVVEKSEKETVIVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PEXAKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEEXAKSPEKAKSPEKAKSP 

PIT PP a ic CPP if cp \nc AFAK ^PF KA K SP VKAE A 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 

KEEEKPQEVKVKEPPKXAEEEKAPATPKTEEKK 

DSKKPJBAPKKEAPKPKVEEKKEPAVEKPKESKV 

EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 

EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD 

TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 

KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK


3456 



A 


258 


1463 


YLSFIPGHASKSAPMNGHCFAENGPSQKSSLPPLL

TPPSENLGPHEEDOVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPGXRRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNGXGVP 

DPNPPPPQTHPJU.RRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCDLP1IENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 

MVPIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK

QSTASRQSTASRQSVVSKQATSALQQEETSEKKS 

RKWIRGKAERLSLRKTLEETETYHAKLNEDHLL 

HAPEFIIKJPRSHTVWEKENVKJLHCSIAGWPEPRV 

TWYKNQVPIKVHANPGKYTTESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVWKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD 

DKFDVSFGREGETMSLGCRWITPEIKHFQPEIQ 

WYRNGVPLSPSKWVQTLWSGERATLTFSHLNKJE 

DEGLYTIRVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYmSWKQPAVDGGSPIL 

GYFIDKCEVGTDSWSQCNDTPVKFARFPVTGLIE 

GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK 

ARLKS/PPLSTLDWTWIVTEEEPSEGIVPGPPTDLS 

VTEATRSYWLSWKPPGQRGHEGIMYFVEKCEA

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTWGDKLDIPKAPGKI 

IPSRNTDTSVVVSWEESKDAKELVGYYIEANVA 

GSGKWEPCKNNPVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEABEVKAAIAPPSPPCDITC 

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGWGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWTIAVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG

LKEGVSYVFRVRAINQAGVGKPSDLAGPVVAET 

RPGTKEVWNVDDDGVISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCDVTDTDGIASSYLIDEEELKRLLALSH 

EHKFPTVPVKSELAVEILEKGQVRFVWMQAEKLS 

GNAKVNYn^NEKGIFEGPKYKMHIDRNTGIIEMF 

MEKLQDEDEGTYTFQLQDGKATNHSTWLVGD 

VrJsJsJ^l^KJbAJfcr t^KA^Jb W U\JMs^Ox^rix* VliXJL& WJ&V 1 

GECNVLLKCKVANIKKETHIVWYKDEREISVDE 

KHDFKDGICTLLITEFSKKDAGIYEVILKDDRGK 

DKSRLKLVDEAFKELMMEVCKK1ALSATDLKIQ 

STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

RVKTGVTGEQIWLQINEPTPNDKGKYVMELFDG 

KTGHQKTVDLSGQAYDEAYAEFQRLKQAAIAEK 

NRARVLGGLPDVVTIQEGKALNLTCNVWGDPPP 

EVS WLKNEKALASDDHCNLKFEAGRTAY KI'ING 

VSTADSGKYGLVVKNKYGSETSDFTVSVFIPEEE 

ARMAALESLKGGKKAK 


3458 


A 


3963 


827 


LSRSSSDNNTNTLGRNVMSTATSPLMGAQSFPNL

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLA FT .DDDEDLPEPDEEDDENEDDNQEDQEY 

EEVMILRRPSLQRRAGSRSDVTHHAVTSQLPQVP 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

RQFSALVPAFDPRPGRTNVQQTTDLEIPPPGTPHS 

ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTN 

FRSTIFYYVQKLLQLSCNGNVKSDKLRRIWEPTY 

TTMYREMKDSDKEKENGKMGCWSIEHVEOYLG 

TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

IRKNRNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 

LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 

LRILYTVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 

KKITTKILQQIEEPLALASGALPDWCEQLTSKCPF 

LIPFETRQLYFTCTAFGASRAIVWLQNRREATVE 

RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

MEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 

LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI 

TKLFHFLGIFLAKCIQDNRLVDLPISKPFFKLMCM 

GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFILDPPKPKPPAWFNGILT 

WEDFELVNPHRARFUCEIKDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSIEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL 

MFDFCMHTGlQKQMEAFRDGFNKVFPMEKLSSF 

SHEEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 

DSrKjKLRFVRVI^GMSSDERKAFLQFTTGCSTLP 

PGGLANLHPRLTVVRKVDATDASYPSVNTCVHY 

LIO.PEYSSEEIMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKXAVKQKQIRRGVKEVQKFS^NKGEKGIMVLA 

GDTLPmVYCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A 


139 


1997 



QVTNMSDKSELKAELERKKQRJLAQIREEKKRKE 

EERKKKFl'DQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK 

TLKKDEENNDSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINJJrFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFFVDERXWSKASGWVSCLDWSSQ 

YPNELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

mrVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFWGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3461 


A 


139 


1997 



QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKJBTD QKKEA VAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD

SGJpGAVGSRRGPIKLGMAKrrQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEENVDSKAPPHELTBEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ

YPNELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSroWTVKXWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 

2643 



TAPEFSRSTHASAHASVARVLRNREIAQLKKEQR 

RQEFQIRALESQKRQQEMVLRRKTQEVSALRRL 

AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 

AESGARSVSSIVRQWNRKINHFLGDHPAPTVNGT 

RPARKKFQKKGASQSFSKAARLKWQSLERRUDI 

VMQRMTTWLEADMEiaiKKREELFLLQEALRR 

KRERLQAESPEEEKGLQELAEEIEVLAANIDYIND 

GITDCQATIVQLEETKEELDSTDTSVVISSCSLAE 

ARLL1JDNF1JCASIDKGLQVAQKEAQIRLLEGRLR 

QTDMAGSSQNHLLLDAiRBKAEAHPELQALIYN 

VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 

DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 

TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 

PTRGSTFPRQSRATETSPLTRRKSYDRGQPIRSTD 

VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 

KSDDSDSSLXSEVLRGIISPVGGAKGARTAPLQCV 

SMAEGHTKPILCLDATDELUnrGSKDRSCKMWN 

LVTGQEIAALKGHPNNWSIKYCSHSGLVFSVST 

SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 

RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

WELSRFQPVGKLTGHIGPVMCLTVTQTASQHDL 

VVTGSKDHYVKMFELGECVTGTIGPTHNFEPPH 

YDGIECLAIQGDILFSGSRDNGIKKWDLDQQELIQ 

QIPNAHKDWVCALAFIPGRPMLLSACRAGVIKV 

WNVDNFTPIGEIKGHDSPINAJCTNAKHIFTASSG 

CRVKVWNYVPGLTPCLPRRVLAIKGRATTLP 



198 



3146 



SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 

GVYRAESIHTGLEVAIKMIDKKAMYKAGMVQR 

VQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 

MCHNGEMNRYLKNRVKPFSENEARHFMHQIITG 

MLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGL 

ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 

SDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVV 

LADYEMPTFLSIEAKDLfflQLLRRNPADRLSLSSV 

LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 

TASSSTSISGSLFDKRRLLIGQPLPNKMTWPKNK 

SSTDFSSSGDGNSFYTQWGNQBTSNSGRGRVIQD 

AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 

ERCHSAEMLSVSKRSGGGENEERYSPTDNNANEF 

NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 

FADPTPQTETVQQWFGNLQINAHLRKTTEYDSIS 

PNRDFQGHPDLQKDTSKNAWTDTKVKKNSDAS 

DNAHSVKQQNTMKYMTALHSKPEIIQQECVFGS 

DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 

LKPIRQKTKKAVVSILDSEEVCVELVKEYASQEY 

VKEVLQISSDGNTITIYYPNGG\RGFPLA\DRPPSP 

T\DNISR\YSF\DNLPEKYWRKYQYASRFVQLVRS 

KSPKJTYFTRYAKCILMENSPGADFEVWFYDGV 

KIHKTEDnQVEEKTGKSYTLKSESEVNSLKEEIK 

MYMDHANEGHRICLALESESEEERKTRSAPFFPn 

IGRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 

VMHSAA5PTQAPILNPSMVTNEGLGLTTTASGTD 

ISSNSLKDCLPKSAQLLKSVFVKNVGWATQNLTS 

GAVWVQFNDGSQLWQAGVSSISYTSPNGQXTTR 

\YGENEKLPDYIKQKLQCLSSILLMFSNPTPNFH 

3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
EMASKTIPELLKWIEDGIPKDPFLNPDLMKNNPW 
VXEKGKCTIL 


3465 



A 


5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAFT ,FAERAG WRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRJB 

RDGVVRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHEUKLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLJOTMAQYNYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRRGLVPSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS

LLSGKAQGWDRGLCQMVRVGSKTEVATEDLDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHWYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QIX3ASQQYASDFHNVLKEEQEALCLDLWGTERR 

EERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRIJLWTPK1MIAALDYDPGDGQMGGQGKGRL 

ALRAGDWMVYXGPMDDQGFYYGELGGHRGVL 

VPAMLRIKMSSQGH 


3466 


A 


1 


1111 


MSKPPDLLLRLLRGAPRQRVCTLFIIGFKFTFFVSI 

MTY^VHVVGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGN1FFLETSDRTNPNFLFMCSVESAARTH 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 

PVLSDASRJALMWKPGGIYLDTDnVLKNLRNLT 

NVLGTQSRYVLNGAFLAFERRJHEFMALCMRDFV 

DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRGVTTLPPEAFYPIPWQDWKKYFEDINPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 


3467 


A 


1 


2175 


MAKVILKQSKQCKNLLTCKVAQVCPVCGCLHC 

YFWWLSGLESRRPSSPLDDIKPIEFGVLSAKKEPIQ 

PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 

TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA

RDLPPPISHDGSRQDMAHSNPYVKICLLPDQKNS 

KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 

TWDFDKFSRHCVIGKVSVPLCEVDLVKGGHW 

WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 

LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 

NLLEAKQQRLV^GEMLFIPARAANLPVNNKPVM 

LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLPXSL 

QRGEGEAMLSXALTLFSRSPLEQNIIQPLVLSLLHL 

CGSVVNMPPGNSQPRGDFLYHSICTWVQDNYAQ 

PLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVR 

WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 

CRVFRRQFGMDYVDILQIHRWDYNTPIEETLEAL 

NDWKAGKARYIGASSMHASQFAQALELQKQH

GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV 

AVIPWSPLARGRLTRPWGETTARLVSDEVGKNL 

YKESDENDAQIAERLTGVSEELGATRAQVALAW 

LLSKPGIAAPnGTSREEQLDELLNAVDITLKPEQI 

AELETPYKPHPVVGFK 


3468 


A 


147 


3209 


ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPlKt'LAHQNAC 

STDPPVMVnGGQENPNNSSASSEPRPEGHNNPQ 

VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH

FLVAATGTAAGGGGGLILASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPIKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK

PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC 

LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQIHLRSHTGER 

PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAFNKFVLMKAVEPKNKADENTPPGSE 

GSAISGVAESSTATRMQLSKLVTSLPSWALLTNH 

FKSTGSFPLPLCARALGXASPSETSKLQQLVEKJD 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNLRAHFVGHKASPAARAQNSCPICQKKFT 

NAVTLQQHVRMHLGGQIPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLFVTCWCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 



3 



5664 


NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGN1RILG 

CVKEVNELELVISLPNGLQGFVQVTEICDAYTKK 

LNEQVTQEQPLKDLLHLPELFSPGMLVRCWSSL 

GITDRGKKSVKI^LNPKhTVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKNKGAKLKVGQYLNCIVEKVKGNGGVVSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 

KVTPFGLTLNFLTFFTGVVDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRWHLSLRP1FLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 

AYARl^HLSDSKNVFNPEAFKPGNTHKCRIIDYS 

QMDELALLSLRTSHEAQYLRYHDIEPGAVVKGT 

VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 

OTEKXYrflGDEVKCRVLLCDPEAKKLMMTLKKT 

LIESKJLPVITCYADAKPGLQTHGF11RVKDYGCIV 

KFYNNVQGLVPKHELSTEYBPDPERVFYTGQW 

KWVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 

QKKGKAINIGQLVDVKVLEKTKDGLEVAVLPHN 

IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 

CLSQSEGRVLLCRKPALVSTVEGGQDPKNFSEIH 

PGMLLIGFVKSKDYGVFIQLPSGLSGLAPKAIMS 

DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 

SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM 

SNRDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 

VFSGGPVPDLVLKASRYHRAGQEVESGQKKKW 

ILNVDIJLKLEVHVSLHQ\DLV\NRKARKLRKGSE 

HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTL^IGDMVTGTVKSIKPTHVVVTLEDGIIGCI 

HASHILDDWEGTSPTTKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRTTPELSVRPSELEDGHTAL 

NTHSVSPMEKIKQYQAGQTVTCFLKKYNVVKK 

WIJEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATWGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRVVKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 

SETPLEDFVPQKWRCYILSTADNVLTLSLRSSRT 

NPETKSKVEDPEINSIQDKEGQLLRGYVGSIQPH 

GVFFRLGPSWGLARYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD

VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK 

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 

YYREGKEEAEETNVLPKEKQTKPAEAPRLQLSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI 

KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 

SADDFDRLVLSSPNSSELWLQYMAFHLQATEDSK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

EKFQEAGELYNRMLKRFRQEKAVWDCYGAFLLR 

RSQAAASHRVLQRALECLPSKEHVDVIAKFAQL 

EFQLGDAERAKAIFENTLSTYPKRTDVWSVYID 

MTIKHGSQKDVRDIFERVIHLSLAPKRMKFFFKR

YLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 

ED 


3470 


A 


2334


1226


TAAAPVAPGTMDDATVLRKKG YJVGINL GKGS Y 

AKVKSAYSERLKJPNVAVKIIARKK1F11)FVERFL 

PREMDILATVNHGSDKTYEIFETSDGRIYIIMELG

VQGDLLEFDCCQGALHEDVARKMFRQLSSAVKY 

CHDLDIVHRDLKCENLLLDKDFNKLSDFGFSKR 

CLRDSNGRIILSKTFCGSAAYAAPEVLQSIPYQPK 

VYDIWSLGVTLYIMVCGSMPYDDSDIRKMLRIQK 

EHRVDFPRSKM^TCECKDLIYRMLQ\PDVS\KRLH 

EDEILSHSWLQPPKPK\ATSSASFKREGEGKYRAE 

CKXDTKTGLRPDHRPDHKLGAKTQHRLLVVPEN 

ENRMEDRLAETSRAKDHfflSGAEVGKAST 


3471 



A 


537 


148 


TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLN1NN/FYPGILMVTWKAD 
GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEQW
RSRRSYSCQVMQEGSTVEKSVAPAECS 


3472 


A 


1 


2272 



DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHWFLRLREGLKNQSPTEAEKPASSSLPSS

PPPQLLTONVVFGLGGELFLWDGEDSSFLVVRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILbPH 

WLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYELYENGE1PLTYISLLHSPGN/I 

WKAVGSIAHASVAAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVXLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK

KQLEDLSYCREERKSLRBMAERLADKYEEAKEK 

QEDIMNRMKKIXHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SI^KPTHLSAYQRKCIQSIIJKEEGEHIREMVKQIN 

D1RNHVNF 


3473 


A


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLFNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNWFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFErYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

VNCST1PVAERFFTSSTSLTLKHAAWYPSEILDPH 

VVLLTSDNVDUYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASVAAEDNYGYDACAVLCLPCVPN

ILVIATESGMLYHCVVLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCnTIEAGVHSVGLTWIHKL 

HKFLGSDEEDKJDSLQEI^TEQKCFVEHILCTKPLP 

CRQPAPIRGFW1VPDILGPTMICITSTYECLIWPLL

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKD1APPPEECLQLLS 

RATQVFREQYELKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMhflFlMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 


3474 


A 


4344 


2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDFLDLAESPNASDTECSDEBPLKVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELX)R 

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TNFVK^LSALSDWYSVYTSAIAFTVYMNAVWH 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA

SCFFPYRLVGLAVGLYAGIKFFUDFIFKRCPRLR 

AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS

SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 

LTCNERPLAVCENGWRCCLINRDRKMPTDYIRN 

GVLYVTVE^LCraSSKSGSSKIU^VIKLVDITDI 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKJTSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

ffiLMESRKDITNQEELWKMKPRRNLEHDDYLHK

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPnCIAAIIASLTTLYTLLREVIHPLA 

TSHQQYFYKIPILVrNKVLPMVSITLU\JLVYLPGV 

IAAIVQLHNGTKYXKFPHWLDKWMLTRKQFGL 

LSFrTAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDALVIEHDVWRMEIYVSLGIVGLAILAL 

LAVTSBPSVSDSLTWREFHYIQSKLGIVSLLLGTIH 

ALIFAWNKWIDIKQFVWYTPrrn 7 MIAVFLPIVVLI 

FKSIIJLPCIJUCKILKIRHGWEDVTKINKTEICSQL 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL 

DLlLCh^TAYQEWKPENISLRNKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQLIKTNKKHI 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNUCRLYDKAAKMYHQL

KKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQ 
PKSSSVDSELTSLCQSVLEDFNLCLFYLPSSPNLS 
LASEDEEEYESGYAFLPDLLIFQMVIICLMCVHSL 
ERAGSKQYSAAIAFTIj^LFSHLVNHVNIRLQAEL 

eegenpvpafqsdgtdepeskepvekeeepdpepp 

pvtpqvgegrksrkfsrlsclrrrrhppkvgdds 

dlsegfesdsshdsarasegsdsgsdkslegggt 

afdaetdsemnsqesrsdledmeeeegtrsptle 

pprgrseapdslngplgpseasiasnlqamstqm 

fqtkrcfrlaptfsnlllqpttnphtsashrpcv 

ngdvdkpsepaseegsesegsessgrscrnersiq 

BKLOVLMAEGLLPAVKVFLDWLRTNPDLnVCA 

QSSQSLWNRLSVLLNLLPAAGELQESGLALCPBV 

QDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 

RRFNFDTDRPLLSTLEESVVRICCIRSFGHFIARLQ 

GSILQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 

QEEARR^JRLMRDMAQLRLQLEVSQLEGSLQQPK 

AQSAMSPYLVPDTXJALCHHLPVIRQLATSGRFIVI 

IPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGN 

RYIRCQKEVGKSFERHKLKRQDADAWTLYKILD 

SCKQLT\LAQGAGEEDPSGMVTIITGLPLDNPSVL 

SGPMQAALQAAAHASVDIKNVLDFYKQWKEIG 


3477 


A 


1 



3902 


MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 

KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 

GGDAVATTGEIHEEKAWKTOALEVGQPAQRDIR 

RGELWGKEHGADQAIQETLEDLSSLERTLVVSES 

SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 

ELGREERRRQAGAAFQVLQLPQALP1QVDSEEGL 

LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 

HVEIQVLDINDHQPRFPKGEQELEISESASLRTRIP 

LDRALDPDTGPNTLHTYTLSPSEHFALDVIVGPD 

ETKHAELrVVKELDREIHSFFDLVLTAYDNGNPP 

KSGTSLVKVNVLDSNDNSPAFAESSLALEIQEDA

APGTLLIKLTATDPDQGPNGEVEFFLSKHMPPEW 

LDTFSIDAKTGQVDLRRPLDYEKNPAYEVDVQAR 

DLGPNPIPAHCKVLIKVLD\^mPSIHVTW'ASQP 

SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 

SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 

YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 

EK5RYEVSTRENNLPSLHLITEKAHDADLGINGK 

VSYRIQDSPVAHLVAIDSNTGEVTAQRSLNYEEM 

AGFEFQVIAEDSGQPMLASSVSVWVSLLDANDN 

APEWQPVLSDGKASLSVLVNASTGHLLVPEETP

NGLGPAGTDTPPLATHSSRPFLL'ri 1 VARDADSG 

ANGEPLYSIRSGNEAHLFILOTHTGQIJVNVTOA 

SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 

VDHLRDSARKPGALSMSMLTV1CLAVLLGIFGLI 

LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 

PQKfflQKADIHLVPVLRGQAGEPCEVGQSHKDV 

DKEAMMEAGWDPCLQAPFHLTPTLYRTLKNQG 

NQGAPAESREVLQDTVN1XFNHPRQRNASRENL 

NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 

EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 

LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 

SLLHQG QFQPKPNHRGNKYLAKPG GSRS AIPDTD

GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDM.SAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL

RRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRF StLPKDKIQDFLKDS Q 

LQFEAISDEEKTLREOEIVASSPSLSGLKLGFESIY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDIVAUL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC 

MRQLHKALRENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFIKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKIILSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQ1LDLVKGTHYQ 

VNACQKYFEMIHTVDDCGFSVLSHPNQYFCESQRI 

LNGGKDIKKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRKNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAILP 

EAARARRIRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV 

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLEhTyRNLVWLGLSISKPNMISLLEQ 

GKEPWMVERKMSQGHCADWESWWE1EELSPK 

WFIDEDEISQEMVMERLASHGLECSSFREAWKY

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SNAWEKHTPEISIITnITTESVPTIQQVHKFDIYDKJLF 

PQNSVIIEYKRLriAEKESLIGNECEEFNQSTYLSK 

DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF

LVEHQRMTX3EKPYECKECNKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYECI 

KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 

RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 

USU^EKDEEMSTKVYLDLEWTOYRLSWDPAEH 

DGIDSLRITAESVWIJDVVLLWJNDGNFDVALDI 

SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 

NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI

HEGTFIENGQWENIHKPSRLIQPPGDPRGGREGQ 

RQEVIFYLIIRRKPLrm-\nWIAPCILITIXAIFV^ 

LPPDAGEKMGl^lFALLTLTVFLLLLADKVPETSL 

SVPIIIKYLMFTMVL\rn^VILSVVVLNLHHRSPH 

THQMPLWVRQIFIHKLPLYLRLKRPKPERDLMPE 

PPHCSSPGSG WGRGTDE* F1RKPPSDFLFPKFNRF 

QPELSAPDLRRFIDGPNRAVALLPELREWSSISYI 

ARQLQEQEDHDALKEDWQFVAMVVDRLFLWTF 

IIFTSVGTLWIFLDATYHLPPPDPFP 


3482 


A 


1273 


172 


ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA 

RPGRFFGVYLLY(^NPRYRVR\VYVGFTVOTARR 

VQQHNGGRKKGGA\GRTSGRGPWEMVLVVHGF 

PSSVAALRFEWAWQHPHASRRLAHVGPRLRGET

AF AFHLRVLAHMLRAPP WARJLPLTLR WV RPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET 


3483 


A


230 


3686 



WRPWPCIDTSWNLQVAARTLRVSSAQCGLVPT 

MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 

SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 

LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 

DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD 

KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 

TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 

DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSQ 

LTEDLGGTLDYCHSRWLCQRTAJESFALMVKQT 

AQMLQSFGTELAETELPNDVQSTVSSVLCAHTEK 

KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV 

NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 

QQKLEQCLQLRHreQGFREVKAILDAASQKIATF 

TDIGNSLAHVEHLLRDLANFQEKSGVFVERARA 

LSLTASSFIGNKHYAVDSIRPKCQELRHLCDQFSA 

EIARRRGLLSKSLELHRRLETSMKWCDEGIYLLA 

SQPVDKCQSQDGAEAALQEIEKFLETGAENKIQE 

LNAJYKEYESILNQDLMEHVRKVFQKQASMEEV 

FHRRQASLKKLAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 

RQGRGSAGEEEESLAILRRHVMSELIJDIERAYVE 

BLLCVLEGYAAEMDNPLMAHLI^TGLHNKKDV 

LFGNMEEIYHFHNRIFLRELENYTDCPELVGRCF 

LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKPVQRITKYQLLLKEM 

LKYSRNCEGAEDLQEALSSELGILKAVNDSMHLI 

AITGYDCn^OLGDLGKLLMQGSFSVWTDHKRGHT 

K VKEL ARFKPMQRHLFLJIEKA VLFCKKREENG E 

GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 

IWYNAREEVYIVQAPTPEIKAAWVNEIRKVLTSQ 

LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 

NIKKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 

GPKKLVPGKYTWADHEKGGPDALRVRSGDW

ELVQEGDEGLW


3484 



A 



208 


6103 



VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEATVELVENGK 

KVKVNKDDIQKMNPPKJFSKVEDMAELTCLKEAS 

VLHNLKERYYSGLIYTYSGLFCVVINPYKNLPIYS 

EErVEMYKGKKRHEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTICKVIQYLAYVASSH 

KSKKDQGELERQLLQANPELEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANDETYLLEKSRA1RQ 

AKEERTFfflFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKiCEROTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLRINK

ALDKTKRQGASHGBLDIAGFEIFDLNSFEQLCINY 

TNEKLQQLFNrnMFILEQEEYQREGIEWNFIDFG 

LDLQPCIDLBEKPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY

AGKVDYKAJ^EWLMKNMDPLhn^NIATLLHQSSD 

KFVSELWKDVDRIIGLDQVAGMSETALPGAFKT 

RXGMFRTVGQLYKJEQLAKLMATLRNTNPNFVR 1 

CnPNHEKJCAGKXDPHLVLDQLRCNGVLEGIRJCR 

QGFPNRWFQEFRQRYEILTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKWFRAGVLAHLEE 

ERDLKITDVnGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQULEI>QNCKLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSI^KLKNKHEAMITDLEERLRR 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQMA 

ELKMQLAKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCERVASKNKAEKQKRDLG

EELEALKTELEDTLDSTAAQQELRSKREQEVNIL \ 

KK71.EEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKV1JLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELXQEENRQKLSLSTKLKQVEDE 

KNSXFREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 1 

SACNLEKKQKKFDQLLAEEKTISAKYAEERDRA I 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE J 

LEDERKQRSMAVAARKKLEMDUCDLEAHIDSA 1 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR 

EEII^QAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEEL^EEQGKim-INDRLKKANLQIDQI 

NTDLNLERSHAQKNENARQQLERQNKELKVKL

QEMEGTVKSKYKASITALEAKIAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLKNKLRRGDL

PFWPRRMARKGAGDGSDEEVDGKADGAEAKP 

AB 


3485 


A 


2 


1782 


CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLPTLLGAVVRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNLFRLIKRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFS'ri'PNLKCLDLSSNKLKTWK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYLS GNFLTOFPMDL YVGRFKLAELMFLD VS 

YNRn*SMPMHHINLVPGKQLRGIYLHGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDVTCRLWSDS

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAQ

VGERLMVHCDSKTGN ANTDFIWVGPDNR1 J ,FPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

QRLLNEWDVTINVSNFI^SRSHAHEAFNTAFTT 

LAACVASIVLVLLYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKRVVFL 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 


3486 


A 


357 


1173 


GDPRETKVFPSRSFARNTVGVSHHQSHLFHTVSR

IYVEDKHKJLYCEWKAGCSNWKRILMVLNGLA 

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTKXLVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAIDCKYRPNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMD1HWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMIGAPKELKFPNFKDRHSSDERTNA 

QVVRQYLKDLTRTERQUYDFYYLDYLMFNYTT 

PFL 


3487 


A 


2 


3281 


CX>KSGAVPFSTTRSPRRPSPRSAGPSLSSVSPRSQ 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN

LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GVAVGRAAKLYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

QRGEEEEAEARAKLAPGREPPSPCHSEDSLGLGA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTOTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL

GKNNDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTCALMLLNTDLHGHNIGKRMTCGDFIG 






WO 01/57190 



PCT/US01/04098 



NLEGLNDGGDFPRELLKALYSSIKNEKLQWAIDE

EELRRFLSELADPNPKVDCRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCKKTPRGKRG 

WKSFHGILKGMILYLQKEEYKPGKALSETELKN 

AISIHHALATRASVNYSKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWmUNWAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATRLSQEEQVRTHEAKLKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLVASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAPNYDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KNPYPTKGEKIMLA1ITKMTLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA 

EDEEVVATAGDRLTEFRKGAQSLPGPCAAAREG 

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP

RLTMHYPCLEKPRJWSLAHTATASAVEGAPPARP 

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPWQCQYP 

SGAEGSGPPAALGVSMQK 1K1 YRPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 



IAAYHKALSYRGHVHANNRGTNNVHFTPPPSPS 

RGILPMNPRNMMNHSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSENVTGLDLSDFP 

ALADRNRREGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK

SNLNTSGKTTSSTDGPKFPGDKSSTTQNNKQQKK 

GIQVLPDGRVTNIPQGMVTDQFGMIGLLTFIRAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TAIKLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFNRDWRYHKEERVWITRAPGMEPTMKTNTY

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHmGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL

RFHCKLCECSF>fl}LNAKDLHVRGRRHRLQYRKK 

VNPDLPIATEPSSRARKVLEERMRKQRHLAEERL 

EQLRRWHAERRRIJBEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATI 

YPTEQELLA VQRA VSHAERALKLVSDTLA FF.DR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

LLRGDR3STVRLALLCSEKPTHSLLRRIAQQLPRQL 

QMVTEDEYEVSSDPEANIVISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVTVIRVLRPLCRRV 

PTVWGALPAWAMELLVEKAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQraKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 


3491 


A 


2 



1321 


FVGDGALSGCRRGRAPRVPSMAGSLPPCVVDCG

TGYTKLGYAGNTEPQFDPSCIAIRESAKVVDQAQ 

RRVLRGVDDLDFFIGDEAIDKPTYATKWPIRHGn . 

EDWDLMERFMEQVVFKYLRAEPEDHYFLMTEP 

PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 

AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 

YVIGSCIKHIPIAGRDITYFIQQLLREREVGIPPEQS 

LETAKADCEKYCYICPDIVKEFAKYDVDPRKWIK 

QYTGINA1NQKKFVIDVGYERFLGPEIFFHPEFAN 

PDFMESISDVVDEVIQNCPIDVRRPLYKNVVLSG 

GSTMFRDFGRRLQRDLKRVVDARLRJLSEELSGGV 

RDCPKPVEVQVVTHHMQRYAV\WFGG\SMLASTP 

EFFQVCHTKKDYEEYGPSICRHNPVFGVMS


3492 


A 


3 


2024 


FNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS

REESPAPSRAPASASLWRRLWVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS

SPPKJRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGhnPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADIJTIWIJPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKM7 .DCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LT^QELWAFIVTNnLASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRF1JRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASK1PDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3493 


A 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS~ 

REESPAPSRAPASASLWRRLWVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDE1LPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTAJLR 

LTNHQELWAFIVTNl^SVYIREGMUiQEVVVLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESKNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 



2 



1615 

* 


VLRGQRGPAGGLAEERRRGRNEWRfflDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TA JLHQCCBDNFEEIVKT J ,1 ,SHGANVNAKDNELW 

TPLHAAATCGHINLVKJLVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVTETCMAYQGITQEKINEMRV 

APEQQMIADmCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGAhALNARTSMDE 

NfPIDLCEEEEFKVLLIJBIJC\HKJHDDVIMKSQLJUK 

SSLSRRTSHRQAS/SVGKVVRRTQPVGTGPNLWR 

KEYE/GEEAILWQRSA\AEDQRTSTYNGDIRET\R 

TDQENKDPNPRLEK\PVLLSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYAJLANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPUG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APMADTTPNGPQGAGAVQFMMTNKLDTAMWL 

SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFVNSYPVTMSIFPVU1.FSLLHAATYTKKVL\ 

DARG\SNSLPLLR\SVLDKLSANQQNILKFIACNEI 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSRR 

NPYCRTLFNEIJUVVEHIIMKPACPLFVRRLCLQS 

IAFISRLAPTVP 


3496 


A 


3 


2867 


SSRTREMERKEILRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNWIKVKP 

PSKSGSASASGAQRGSLEEFEDTTWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

DAGHTDQPVPSGSVGGPARPASGPRQAREASLV 

VTCRTNKFRKNNYKWVAASSKSPRVARRALSPR 

VAAENVCKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGDVRPALAHSGLKPLSG 

ETPLSAYKVKTRTKURRRGSTSLPGDKKSGTSPA 

ATAKSH1^LRRRQALRGKSSPVLKKTP>JKGLVQ

VTKHRLCRLPPSRAHLPTKEASSLHAVRTAPTSK 

VIKTRYRIVKKTPASPLSAPPFPLSLPSWRARRLS 

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYR 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAHRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHEVAPSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG

KPLHUCPRL 


3497 


A 


1586 


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCR 

NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL 

AKNWVMRMLFLEQPLPQAAVALWVKKEFSKA 

QEESTGLLSGLRIWHTQLLPGGLQGLILNP1FRQN 

IJUALLGGGKAWSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 

GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 

T/RALAINLSSGVSGAGGTVHQPGFTVWETNYRL 

YAYTESELQIALIALFSEMLYPFP\NMVV\ARVTR\ 

ESVQQA1ASGITAQQIIHFLRTRAHPVMLKQTPVL 

PPTTTDQIRJLWELERDRLRFTEGVLYNQFLSQVDF 

ELL\LAHAPKLG\^VFE/NTPAKRLMVVTPAGHS 

DVKRFWKRQKHSS 


3498 


A 


790 


190 



RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR

SLFLSNGVAANDKLLLSSNRTTAIVNASVGSGQRI 

LRG\LQYIKVPVTDARDSRLYDFFDP1ADLIHTVS 

MRQGRTLLNCMAGXMSRSASLCLAYLMKYHSM 

S\LLDAHTWA/TKSRRPIIRPNNGFWEQLINYEFK 

LFN^^^^IVRMINSPVGNIPDIYEKDLRMMISM 


3499 


A 


31 


1586 



TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKJDVGIL

ALEVYFPAQYVDQTDLEKYNNTVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTWQRLMERIQLPWD 

SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD 

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMWCGDIAVYPSGNARPTGGAGAVAMLIGPK 

APLAI^RGLRGTHMENVYDFYKPmASEYPIVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLQYMIFHTPFCKMVQKSLARLMFNDF 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPLVDKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNSLFPGT 

WYLERVDEQHRRKYARRPV 


3500 


A 


185 


2692 


MLPTE VPQS HPGPS ALLLLQLLLPPTSAFFPN1 WS 
Llj^^GSITHQDLTEEAALNVTLQLELEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR

ARLVGALRETVVAARALDHTLARQRLGAALHA 

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS

RLLDITPASSLSFVLDTTGSMGEEINAAKIQARHL 

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVK1 UASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EV1F1KDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGQPLVFSVDGLLQKITVRIHGDISSFWIKNPAG 

VSQGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFG1PME 

DGPHPGLYPLTQPVAGLQTQLLVEVTGLGSRAN 

PGDPQPHFSHVILRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRLHR* 

AAPQPSTWPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV 

PPTHAPLRLLVSAPAPQDRH 


3501 


A 


1245 


5815 

* 


RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

N7TLFIIXJVEAKWVEVKSKRRDMTVFSGLFVGG 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGGXSPCEAGEEGE 

GGVClJNGGVCSWDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFNDNAWHDVK\nTRNLRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKMKIHGWAFKCENVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGLILFSHGKPRHQ 

KDAKHPQMEKVDFFAIEMLDGHLYLLLDMGSGT 

IKIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPWMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGR VKLTVNLDCIRJNCNS S 

KGPETLFAGYNLNDNEWHTVRWRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGUTERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIWELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

MISRDTSNLHTVKIDTK1TTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSn\CQ

EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND 

PGTTYIFSKGGGOITYKWPPNDRPSTRADRLAIGF

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAffiESNAJONDGKYHVVRFTRSGGNA 

TLQVDSWPVIERYPAGRQLTIFNSQATIIIGGKEQ 

GOPFOGOI^GLYYNGLKVLNMAAEhn^ANIAIVG 

KVRLVGEVPSSMTTESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANFIKAGGREPYPGSAEVIRE 

SSSTTGMWGWAAAALCILE.LYAMYKYRNRDE 

GSYHVDESRNYISNSAQSNGAWKEKQPSSAKSS 

NKNKKNKDKEYYV 


3502 


A 


394 


72 


KPAHLPFTVUMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF


3503 


A 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVS Q1SQISLGRRPLS 

SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 

CKJMTFRPSMEEFREFNKYLAYMESKGAHRAGL 

AKVTPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 

SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 

LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 

DEWNIARLNTVLDVVEEECGISIEGVNTPYLYFG 

MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 

PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS

PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 

HGFNCAESTOTATVRW1DYGKVAKLCTCRKDM 

VKJSMDIFV^RKFQPDRYQLWKQGKDIYTIDHTKP 

TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK

RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 

KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHI 

KLSGNSCLSTSVTEDIKTEDDKAYAYRSVPSISSE 

ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 

ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 

NSFKWSIAEGENKTSKSWRHPLSRPPARSPMTL 

VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 

WQTKPPNFAAEQEYNATVARMKPHCAICTLLMP

YHKPDS S>nSENDARWETKLDE VVTSEGKTKPLEP 

EMCFTYSEENIEYSPPNAFLEEDGTSLLISCAKCC 

VRVHASCYGIPSHEICDGWLCARCKRNAWTAEC 

CLChn-RGGAJJCQTKNNKWAHVMCAVAVPEVR 

FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 

GACIQCSYGRCPASFHVTCAHAAGVLXMEPDDW 

PYVVhnTCFRHKVNPKVKSKACEKVISVGQTVIT 

KHR>nRYYSCRVMAVTSQTFYEVMFDDGSFSRD 

TFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLY 

GAKYFGSNIAHMYQVEFEDGSQIAMKJREDIYTL 

DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS

QAQQETYLGFWINSKKSQCNIFLSGrY 


3504 


A 


1124 


139 


RGEBQFDAEFRRFACLGFGERLQEFSRLLRAVHR 

SRAWTCYl^IRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQy SSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQGVLER 

VPGIFISRLVRGGLABSTGLLAVSDEILEVNG1EV

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










AGKTLNQVTDMMVANSHNVLIVTVKPANQRNN 
VVRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELBDNAYDPDV 

NAKQIWIDKTVINDHICL'rKrDNGNGMTSDKLH 

KND^FGFSDKVTMNGHVPVGLYGNGFKSGSMVR 

LGKDAIVFTKNGESMSVGLLSQTYL\EVIKAEHV 

VWIVAF>nCHRQMINIj\ESKASLAAn^EHSLFSTE 

QKLLAELDAnGKKGTRIOWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYKKQERMDQIAPES 

DYSLRAYCSILYLKPRMQnLRGQKVKTQLVSKS 

LAYTERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRNRLIKA YEKVG CQLRANNMG VG VVGII 

ECNFLKPTHNKQDFDYTNEYRLTITALGEKLND 

YWNEMKVKKNTEYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKLPDGMDQLPEKWYCSNNPXDPQFR 

NCEVPEEPEDEDLYHPTYEKTYKXTNKEKFRIRQ 

PEMIPRINAELLFRPT\ALSTPS\FSSPKESVSKR/RH 

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII 

LEENSTPKPAVDHDIDMKSEOSHVEOGGVOVEF 

VGDSEPCGQTGSTSTSSSRCD QGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAVTLPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTCEKENY 

KRQCHMFTDQIKVLQQRILEMNDKYVKKETCH 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

EERLKKQCSALQHVKAECSQCSNNESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTS SNIEE 

SVNHMIX3ESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQWEQMSEISST 


3506 


A 



2 


2120 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAAT1SSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAIVKDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNS CGBBIVCKDMRNLRL AYKXQEEQSKLG 

IFENLNKHAFPI^NGQAI^AFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

HVVPTSVKDDDLSKVAVFI^GRVPVLSWIHPE 

SOATITRCSOPLVGFNDKRCKEDEKYLQTIMDAN 

AQSHKLIIFDARQNSVADTNKTKGGGYESESAYP 

NAELVFLBIHNIHVMRESLRKLKEIVYPSIDEARW 

l^NVDGTHWIJEYIRMLLAGAVRIADKIESGKTSV 

VVHCSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

TLVEKEWI SFGHRFALRVGHGNDNHAD ADRSPIF 

LQFVDC V WQMTRQFP S AFEFNELFLITILDHL Y S 

CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYE>mVLYPVASLSHLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKIRLTN^CAKNLAKKDFFRLPDPFVAKIVVD










GSGQCHSTDTVKNTLDPKWNQHYDLYVGKTDSI 

TISVWNHKKIHKKQGAGFLGCVRLLSNAISRLKD 

TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRIG 

TGGSWDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLRNPDVRGSLQTPQNRPHGHQSPELPEGYEQRT 

TVQGQVYFLHTQTGVSTWHDPRJDPRDLNSVNCD 

ELGPLPPGWEVRSTVSGRJYFVDHNNRTTQFTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 

QRYERDLVQKLKVLRHELSLQQPQAGHCRJEVS 

REEIFEESYROIMKJV1RPKDLKKRLMVKFRGEEG 

LDYGGVAREWLYLLCHEMLNPYYGLFQYSTDNI 

YMLQINPDSSINPDHLSYFHFVGRJMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

VWILENDITPVLDHTFCVEHNAFGRILQHELKPN 

G\RNWVTEENKICEYVRLYVNWRFMRGIEAQFL 

ALQKGFNELIPQHLLKPFDQKELELIIGGLDKIDL 

NDWKSNTRLKHCVADSNIVRWFWQAVETFDEE 

RRARLLQFVTG STRVPLQGFKALQGSTGXAAGPR 

LFTIHLIDANTDNLRKAHTCFNRIDIPPYESYEKL 

YEKLLTAVEETCGFAVE 


3508 


A 


3 


6388 



DLYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKY1JTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTEFDY 

YIDPETKKFEPWSKXVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYY'l'l'SAMLQAVL 

EKPLEKXAGRNYGPPGNKXLIYFIDDMNMPEVD 

AYGTVQPHTTIRQHLDYGHWYDRSKJLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALS SIYSTILTQHLKJLGNFP ASLQKSIPPLEDLAL AF 

HQKIA'riFLPTGIKFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDURLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDL VLFEDAMRHV CHINRILESPRGNALL VG VG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENnSNVRNEVKSQ 

GLVDNRENCWKFFTORIRRQLKVTLCFSPVGNKL 

RVRSRJCFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQWGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRWKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCUCAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVENIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKlAAIKAKIAHI^IEhn^AKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILL1TAFISYLGFFT 

KXYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 






WO 01/57190 PCT/US01/04098













LMDDADVAAWQNEGLPADRMSVENATILINCE 
RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG 
YLQHEQALEAGAVVLIENLEESIDPVLGPLLGRE 
VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 
PELQAQATLINKrVTRDGLEDQLLAAVVSMERP 
DLEQLKSDLTKQQNGFKTTLKTLEDSLLSRLSSAS 
GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 
EVKINEAREHYRPAAARASLLYFIMNDLSKIHPM 
YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 
SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 
EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 
VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 
KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 
RDFVEEKLGSKYVVGRALDFATSFEESGPATPMF 
FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 
GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 
RETEFKSILFALCYFHAWAERRKFGPQGWNRSY 
PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 
EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 
1 SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 
LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 
AGATREEKVKALLEEILERVTDEFNIPELMAKVE 
ERTPYIWAFQECGRMNILTREIQRSLRELELGLK 
GELTMTSHMENLQNALYFDMVPESWARRAYPS 
TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 
TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 
MTKKNREEFRSPPREGAY1HGLFMEGACWDTQA 
GHTEAKLKDLTPPMPVMFIKAIPADXRQDCGHVY 
SCPVTKTSQVRDPTYVWTFNLKTKENPSKWVLA 
GVALLLQI 


3509 


A 



3 



6388 


ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QIXJLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVK>TVTFhr^YTrSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLIYFIDDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTTNPRLQRHFSVFVLSFPGAD 

ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGlKFHYIFmRDFANIFQGILFSSV 

ECVKSTWDLERLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDffiDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDL VLFEDAMRHV CHINRILESPRGNALLVG VG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDIj^SLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENHSNVRNEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAmWFHEWPQQALESVSLRF 

LQ>HBGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRL YQ SLLHRHRKELKCK 

raRJLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ



3510 



390 



3330 



KVAV1MLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVmiVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG 

YLQHEQALEAGAWLIENLEESIDPVLGPLLGRE 

VIKKGRJTKIGDK£CEYOTKFRIJa-rrnCLANPHYQ 

PELQAQATLINFTVTRDGLEDQLL AA W SMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLErnCQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEVVAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALC YFHA WAERRKFGPQG WNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYrVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GIITEAKLKDLTPPMPVMFIKAIPADXRQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 



AAGSGSRPPAPAARKMADLAECNIKVMCRFRPL 

NESEVNRGDKYIAKFQGEDTVV1ASKPYAFDRVF 

QSSTSQEQVYNDCAKKJVKDVLEGYNGTIFAYG 

QTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYIY 

SMDENLEFHDCV S YFEIYLDKIRDLLDVSKTNLSV 

HEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKS 

>nUIVAVTNMNEHSSRSHSIFLINVKQENTQTEQK 

LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 

KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 

LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 

KNTVCVNVELTAE Q WKKKY^KEKEKNKILRNTI 

QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQLDDKDEEINQQSQLVEKXKTQMLDQEEL 

I^STRRDQDNMQAELKRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEPKTKEYELLSDELK






QKSATLASIDAELQKLKEMTNHQKKRAAEMMA 

SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 

RLYISKMKSEVKTMVKRCKQLESTQTESNKKME 

ENEKELAACQ1JRISQHEAKIKSLTEYLQNVEQKK 

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 

VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE 

AKAKLITDLQDQNQKMMLEQERLRVEHEKLKA 

TDQEKSRKLHELTVMQDRREQARQDLKGLEETV 

AKELQTLHNLRKLFVQDLATRVKKSAEIDSVDDT 

GGSAAQKQKISFLENNLEVQLTKSAQTSWYRDNA 

DLRCELPKLEKRLRATAERVKALESALKEAKEN 

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 

AKPIRPG QHPAASPTHPS AIRGGGAFV QNSQPV A 

VRGGGGKQV 


3511 


A 



1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA 

VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG 

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

LLALSACAPFNVRFKKDHGLVGRVFAFDATARP 

PGYEFELKLFIBYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNLAPTPRRRKASPEPEGEAAGKMTTEE 

QQQRHWVAPGGPYSAETPGVPSPIAALKNVAEA 

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 

KFDCAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 

GEIATELAGDIKVKKERDP 


3512 



A 


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKERLSS 

IEKIKQLREQVNDLFSRKFGEAIGVDFPVKVPYR 

KITFNPGCWIDGMPPGWFKAPGYLEISSMRRIL 

EAAEFIK3TVIRPLPGLELSNGEYSTVGKRKIDQE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EY>^RRHYQTNHSKHYDQYMERMRDEKLHELK 

KGLRKYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGl^WEKLREKIRSFVAYSIAIDEITDINN 

TTQLAIFIRGVDENFDVSEELLDTVPMTGTKSGN 

EIFSRVEKSLKNFCINWSKLVSVASTGTPPMVDA 

NNGLVTCLKSRVATFCKGAELKSICCIIHPESLGA 

QVKLKMDHVMDVVVKSWWICSRGLNHSEFTTL 

LYELDSQYGSLLYYTEIKWLSRGLVLKRFFESLE 

ED)SFMSSRGKPLPQLSSIDWIRDLAFLVDMTMH 

LNALNISLQGHSQIVTQMYDLIRAFLAKLCLWET 

HLTRJWLAHFPTLKLVSRNESDGIJ^YIPKI^ 

TEFQKRLSDFKLYESELTLFSSPFSTKIDSVHEELQ 

MEVTOLQCNTVLKTKYDKVGIPEFYKYLWGSYP 

KYKHHCAKILSMFGSTYICEQLFSIMKLSKTKYC 

SQLKDSQWDSVLHIAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSELVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNP1JDRLGEMAGGRHRRVVGTLHLLLLVAA










LPWASRGVSPSASAWPEEKNYHQPAELNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEXDTFLSQTPYGYRSFSNn I 

STLNPTAKRHLVl^CHYDSKYFSHWVNNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGV1QDDHIPFLRRGVPVLHLIPSPFP 

EVWHTNmDNEENLDESTIDNLNKELQVFVLEYL 

HL 


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRJRRTRVPGLVTAS 
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 
LPWASRGVSPSASA WPEEKNYHOPAILNSSALRO 

IAEGTSISEMWQNDLQPLLffiRYPGSPGSYAARQ 
HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNI1 
STLNPTAKRHLVLACHYDSKYFSHWVNNRVFVG 
ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 
DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM
ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF
FNFFPNSARWFERLOAIEHELHELGLIJCDHSLEG
RYFQNYS YGG VIQDDHIPFLRRG VPVLHLEPSPFP 
E VWHTMDDN EENLDESTIDNLNKILQ VFVLE YL 
HL 


3515 


A 


114 


754 


LCRDLTTTMS SKRTKTKTKKRPQRATSNVFAMF 
DQSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS 
LGKMPl'DEYLDAMMNEAPGPINFTMFLTMFGEK
LNGTDPEDVERNAFACFDEEATGTIQEDYLRELL
TT\MGDRF\TDE\EVDELYREAPI\DKKGGIFNYI\E 
FTRHLETGGPKDKDDRKITFQIPSPNVPWLATFG 
VFLEIFLLHGP


3516 


A 


1 


5169 


MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKNYYFRGAAGDHGSCPTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRJCEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

G QLEEAFSL ALLPQL W SLREENPALRKD ALQEL 

mCLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PBLLTTEDLLLGLDLTEVnSLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN

RRLESQFG SQ VPYYLELEASGFPEDPLPC A VTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

QVLGKFNPSSTPHSSLVGFISLLYNLLDDSNFKW 

HGTLEVLHLLVIRLGEQVQQFLGPVIAASVKVLA 

DNKL\^QEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WLLAGNRTQSAHCHCGDHVRDSMHIYGSYSPTI 

CTRRVLSAGKGKNKLPWENEQPGIMGENQTSTS 

KJDIEQFSTYDFIPSABCLKLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTGTHQTNLS

GKCAQLGFSQICGKTGSVGSDLQFLGTTSSHQEK










VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

ILPSYPVSSPRTSPKHTSPLIISPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRRSLSAQKSSVDPTGRVNHG 

VENSQEKPPWQLTT AL\ VRSPS SRRGLNGTKP VPPI 

P\RGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LPIDLSELNFKDKI)LDQEEMHSSLRSLR>JSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRIMSDIFPTFG 

SKPCPTRLSSAKKKISH1AEQSPSAGSSSNPQQISS 

FDFTTTKALSEDSVVVVGKGVFGSLSSAPATCSQ 

SV1SSVENGDTFSIKQSIEPPSG1YGRSVQQNISSYL 

DVENEKDAKVSISKSTYNKMRQKIOCEEKELFHN 

KDCEKKEKN S WERMRHTGTEKMA SE SETPTGAI 

SQYKERMPSVTHSPEIMDLSELRPFSKPEIALTEA 

LRlXADEDWEKKIEGLNFmClJVAFHSEILKrKL 

HETNF A VVQE VKNLRS G VSRAA WCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVN>TVTPARAVVSLrNGGQRYYGRKMLFF 

MMCHPNreKMLEKYVPSKDLPYIKDSVRNLQQK 

GLGEIPLDTPSAKGRRSHTGS VGNTRSSSV SRDA 

FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT 

GLLNAKDFRDRINGIKQLLSDTENNQDLWGNTV 

KIFDAFKSRLHDSNSKVNLVAXETMHKMIPLLRD 

HI^PIINMLIPAIVDNNLNSKNPGIYAAATNVVQA 

LSQHVDNYLLLQPFCTKAQFLNGKAKQDMTEKL 

ADIVTELYQRKPHATEQKVLVVLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 

QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 

VKCISEDQKYGGKGY/EKVKTHKGD/ASKQQAW 

IQKISELDCXRPNVSPKVRELLEQISAFDNVPQVKK 

AKFQNWMKNSLKVHNESILDQVWNIFSEASNSE 

PVNKEQDQRPLHPVANPHAEISTKVPASKVKDA 

VEQQ G E VKKNKRERJCEERQ KXRKREKKELKLE 

NHQENSRKQKPKKRKKGQEADLEAGGEEVPEA 

NGSAGKRSKKKKQRKDSASEEEARVGAGKRKR 

RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 

GKFNWGTIKAILKQAPDNEITIKia.RKKVLAQY 

YTVTDEHHRSEEELLVIFNKKJSKNPTFXLLKDK 

VKLVK 


3518 


A 


3 


635 


APDSNARNDHFDACSLR VQAGLS SAGPALGNSG 

LAALMASPSKAVIVPGNGGGDVTraGWYGWVK 

K£LEKIPGF(^LAJKNMPDPITARESIWLPFMETEL 

HCDEKTHIGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERA SG Y FI RPWQ WEKIKANCP YTV 

QFGSTDDPFLPWKEQQEVADNSWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A 


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KKNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASG GEEG VGG A 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPG Q YLV YNGDLVE YD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM


YRYNALYSLDGLAVVNVKDNPPMKDMFKLLMF 
PENRIFQAENAKIKJREWLEVLEDTKRALSEKRIIR
EQEE AAAPRGPPQVTSKATNPFEDDEEEEPA VPE
VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 
DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 
TE VLVFELSPDRSLRGGPKATRRA VSQLIRLGQC 
TKACELFLRNRAAAVHTAJRQLRJEGATX.LYIHK 
LCHVFFTSLLETAREFEBDFAGTDS GCYS AFVVW 
ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 
KEHCQQLGDIGLDLTFIIHALLVKDIQGALHSYK 
EIIIEATXHRNSEEMWRRMNLMTPEALGKLKEE 
MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 
MGFLEEALKLYFPELHMVLLESLVE1ILVAVQHV 
DYSLRCEQDPEKKAFIRQNASFLYETVLXPWEK 
RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV 


3520 


A 


1706 


540 


FVAHlJVWPWRADGDMEDGVLNEGFLVKRGHIV 

HNWKARWnLRQNTLVYYKLEGGRRVTPPKGRI 

LLDGCTTTCPCLEYENRPLLIKLKTQTSTEYFLEA 

CSREE/RRDA WAFEMTGAIHAG QA RGKVQQLHS 

LRNSFKLPPHISLHRJVDKMHDSNTGIRSSPNMEQ 

GSTYKXTFLGSSLVDWLISNSFTASRLEAVTLAS 

MLMEENFLRPVGVRSMGAIRSGDLAEQFJLDDST 

ALYTFAJESYKKKISPKEEISLSTVELSGTVVKQGY 

LAKC^HKRKNWKVRRFVLRKDPAFLHYYDPSK 

EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 

GNLFKVITK\DDTHYYIQA\SSKAE\RAE\WIGSLS 

KSLNMNKDPEGTPD SLPSLPR 


3521 


A 


3 


3063


HASVSLSLGCPRPCADTPGPQPQPMDLRVGQRPP 

VEPPPEPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSLVPSLPAFSIPRHQSQSST 

PCPFLGCRPCPQLSMDTPMPELQEAPQEQELRQL 

LHKDKSKRSAVASSVVKQKLAEVILKKQQAALE 

RTVHPNSPGIPYRTLEPLETEGATRSMLSSFLPPV 

PSLPSDPPEHTPLRKTVSEPNLKLRYKPKKSLERR 

KNPLLRKESAPPSLRRRPAETLGDSSPSSSSTPAS 

GCSSPNDSEHGPNPILGSEALLGQRLRLQETSVAP 

FALPTVSLLPAITLGLPAPARADSDRRTHPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGLGPLPFHFAQSLMTTERLSGSG 

LHWPLSRTRSEPLPPSATAPPPPGPMQPRLEQLKT 

HVQVDCRSAKPSEKPRLRQIPSAEDLETDGGGPG 

QWDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEQQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSP AAPASLS APEP ASQARVLSS SETPARTLPF 

TTGLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLQERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTNPLSRLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK 

VASRELKNGFAWRPPGHHADHSTAMGFCFFNS 

VAIACRQLQQQSKASKIL1VDWDVHHGNGTQQT 

FYQDPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIWM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCTGYMTQQLM>ILAGGAVVLALEGGHDLTAIC 

DASEACVAAXLGNRVDPLSEEGWKQKPNLNAIR










SLEA\VIRVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 


9 


602 


KMAALGEPVRLERDICRAIELLEKLQRSGEVPPQ 

KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 

EVRANATAKATVAAFAASEGHSHPRVVELPKTE 

EGLGFNIMGGKEQNSPIYISRnP/GGlADRHGGLK 

RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 

KLVVRYTPKVLEEMESRFEKMRSAKRRQQT 


3523 


A 


645 


1465 


1MAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRFNRQLGSMVEIAKQIARPSSGRS 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTVVPLDDATQEYKEKLQKCLEAXLNQKLQBI 

TRCKS SEEKKPGELKRL VESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 


3524 


A 


3 


698 


PMVRHEAGEALGAIGDPEVLEILKQYSSDPVIEV 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 

P AEER\D V GRLRE ALLDESRPLFERYRAMF A LRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGAIA 

RPACLAALQAHADDPERVVREVSCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452 



694 


EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 

SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 

MTDGQLRSKRDEFWDTAPAFEGRKEIWDALKA 

AAYAAEANDHELAQAILDGASITLPHGTLCECY 

DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP 

PPSVRREFPLKVRLSTGKDVRLSASLPDTVGQLK 

RQLHAQE/GTPKPS WQRWFFS GKLLTDRTRLQET 

KIQKDFVIQVIINQPPPPQD 


3526 


A 


123 


3441 


PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP 

RKRSHHGLGFLCCFGGSDIPEINLRDNHPLQFME 

FSSPIPNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKKEQEDPNKLATSWPDYYIDRI 

NSMAAMQSLYAFDEEETEMRNQWEDLKTALR 

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI 

HTSLIGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

RTENSKTKVAVLEILGAVCLVPGGHKKVLQAML 

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK 

TATMSFINAVLNAGAGEDNLEFRLHLRYEFLMLG 

IQPVIDKLRQHENAILDKHLDFFEMVRNEDDLEL 

ARRFDMVHIDTKSASQMFELIHKKLKYTCAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI 

VLQDERGVDPDLAPLENFKVKN1VNMLINENEV 

KQWRDQAEKFRKEHMELVSRLERKERECETKTL 

EKEKMMRTVLNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL 

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG 

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW 

VKLNEERWGTVWNEIDDMQWRI1JDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK 

ELSVIDGRRAQNCIILLSKLKLSNEEIRQAILKMD










EQEDLAKDMLEQIXKFIPEKSDIDLLEEHKHEIER 

MARADRFLYEMSRIDHYQQRLQALFFKKKFQER 

LAEAKPKVEAILLASRELVRSKRLRQMLEVELAI 

GNFMNKGQRGGAYGFRVASLNKIADTKSSIDRN 

ISLLHYLIMILEKHFPDILNMPSELQHLPEAAKVN 

LAELEKEVGNLRRGLRAVEVELEYQRRQVREPS 

DKFVPVMSDFITVSSFSFSELEDQLNEARDKFAK 

ALMHFGEHDSKMQPDEFFGBFDTFLQAFSEARQD 

LEAMRIUOCEEEEIUIARMEAMLKEQRERERWQR 

QRKVLAAG SSLEEGGEFDDLVS ALRSGEVFDKD 

LCKLKRSRKRS G S Q ALE VTRERAINRLNY 


3527 


A 


1445 


714 


LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 

MEKTA VAAEVL l'EUCNTGEMPPLQQQIIRLHQE

LGRQKSLWADVHGKLRSHIDALRBQNMELREKL 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEETIPKYAGHKNVQSGHSSWGQRSSS 

NNSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 


3528 


A 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKLLKKL AFYNPGRNIFLSPLS ISTAFS 

MLCLGAQDSTLDEIKQGFNFRKMPEKDLHEGFH 

YirHELTQKTQDLKJLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAEmTWQNLEMAQKQINDFI/ESKTH 

GKJNNLIE>nDPGTVMLLA>rYlFFRARWKro 

NVTECEEDFFLEKNSSVKVPMMFRSGIYQVGYDD 

KLSC11LEIPYQKNTTAIFILPDEGKLKHLEKGLQV 

DTFSRWKTLLSRRWDVSVPRLHMTGTFDLKKT 

LSYIGVSKJFEEHGDLTKIAPHRSLKVGEAVNKA 

ELKMDERGTEGAAGTGAQTLPMETPLVVKIDKP 

YLLLIYSEKJOPSVLFLGKIVNPIGK 


3529 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTODIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETnQTPSWTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYHQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEK'IEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKJFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETEPMWSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ

QVVFDL1CKVVSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEET1KIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETTVKESG 

KQPG AKPK VKLARKKDDDKKKS SNEKLKQTS V 

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP






NFNIHPLYQHVLLYLQLYD SSRTLYAPSA1KAILK 

TWIAFVNAISTTSVNNAYTPQLSLLQNIXARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVK VTA QDLIGNRNMQMMSIEILTLL 

FTELAKVTES S AKGFPSFISDMLSKCKVQKVTLHC

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIWLEHRVMNT 

IPEEVNETGFDF VVSXDLEHI SPHQPMTSLQ YLHAQ 

SITCQGMFLCAVIRANLHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVWSVTLQLCRNLDNUQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLPPTTQYHQLLVSVDQKHLFEARSGELSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASL I T1NLG 

ATKNLRQQILELLGPI SMNHG VHFMAA1AFV WN 

ERRQNK'ITTRTK VIPAA SEEQLLLVELVRSIS VM 

RAETVIQWKEVIJCQPPA1AKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRKNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLI^EVLAHLLDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRV AVAQS SSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

AVPTM1TELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRl^KLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMDCT 


3530 


A 


1 



5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAEPIGSTSSETBTAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYIIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVY1AEGNHTSELRSEKJLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAJSL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAWIRPPLTQGN1JR.YIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDIJnNK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKWSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETTKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETTVKESG 

KQPGAKPKVKLARKKDDDKKKS SNEKLKQTS V










FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

ThnPIAFWAJSTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 

FTELAKVIESSAKGFPSFISDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEEVNETGFDFWSVDLEfflSPHQPMTSLQYLHAQ 

SrrCQGMFLCAVIRA\LHQHCACKMHPQWlGLIT 

STLPYMGKVLQRVWSVTLQLCKNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQIIJELIXjPISNINHGVHFMAAIAFVWN 

BRRQNKTITRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKJDKKHLSLEVCML 

QFFYA YIQRIPVFNL VDS WAS LLILLKDSIQLSLP 

APGQFL1LGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTh^ESDVEDMI^PAMETANlTPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFOMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADED1SRTSGPSVA 

GLLT1YTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGfflQREFKPYVVRJLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3531 


A 


553 


2470 


LISPSPALSSQDPALSLKENLEDISGWGLPEARSK 

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

LGRPWEEFPLRCPLFAQQRVPEGGPLLDTOKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTNLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRIHTGERPYKCSACEK 

AFSCSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHRMTGEKPYQCGSCGKAFTCHSSLTVH

EKfflSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRIHSGEKPFKC 

NECGKAFSSHAYLIVHRRIHTGEKPFDCSQCWKA 

FSCHSSLIVHQRIHTGEKPYKCSECGRAFSQNHCL 

IKHQKIHSGEKSFKCEKCGEMFNWSSHLTEHQRJL 

HSEGKPUUQFNKHLLSTYYVPGSLLGAGDAGLR 

DVDPIDALDVAKLLCVVPPRAGRNFSLGSKPRN 


3532 


A 


3931 


317 


HRELQDSPSAEPPAGSMPLRHWGMARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSES 

SLTAEEVCIHIAHKVGITPPCFNLFALFDAQAQV 

WLPFNHILEIPRDASLMLYF\RHRFYSR\NWHGM

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 

FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE 

SLGMAFLHLCHLALRHGEPLEEVAKKTSFKDCIP 

RSFRRHIRQHSALTRLRLIWVFRRFLRDFQPGRLS 

QQMVMVKYLATLERLAPRFGTERVPVCHLRLLA 

QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV 

TGTGGIQWWPVEEEVNKEEGSSGSSGRNPQASL 

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 

HVGLKEHCVSMRQDNKCLELSLPSRAAALSFVS 

LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDG1H 

GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRJLILTV 

AQRSQAPDGMQSLRLRKFPIEQQDGAFVLEGWG 

RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ 

PGETSNLHMRGARASPRTLNLSQLSFHRVDQKEI

TQLSHLGQGTRTNVYEGRLRVEGSGDPEEGKMD 

DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF 

YETASLMSQVSHTHLAFVHGVCVRGPENIMVTE 

YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA 

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 

FIKLSDPGVGLGALSREERVERIPWLAPECLPGG 

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 

EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 

RPSFRTILRDLTRLQPHNLADVLTYNPDSPASDPT 

VFHKRYLKKIRDLGEGHFGKVSLYCYDPTNDGT 

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH 

EHUKYKGCCEDQGEKSLQLVMEYVPLGSLRDYL 

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL 

AARNVLLDKDRLVKIGDFGLAKAVPEGHEYYRV 

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT 

LYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLT 

F,T J .F.RGERJLPRPDKCPCEVYHLMKNC WETEASF 

RPTFENLIPILKTVHEKYQGQAPSVFSVC 


3533 


A 


182 


3465 


FRWLDFFRGSINSQFEFGRKKENMTSPAKFKKDK

EIIAEYDTQVKEBRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEIEMDYSRNLEKLAERFLAKT 

RSTKDQQrTCKDQNVLSPVNCWNLLLNQVKRES 

RDH1TLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV 

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA 

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

N\0UEEKHVRRSSVKKIEKMKJ2KRQAKYTENKL 

KAIKARNEYLLALEATNASVFKYYIHDLSDLIDQ 

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD 

AIENAVENLDATSDKQRIJ^MYNNWCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

STLKJE>mEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFYFTKMKEYLEGR>nL.lTKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLWESCIR 

FISRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

1AGDQNDHDMDSLAGVIJCLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLII 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVN^DCTniQHENIFPSPRE 

LEGPVYSRGOSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPEEAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGIDGLIPHQYIVV 

QDTEDGVVERSSPKSEIEVISEPPEEKVTARAGAS

CPSGGHVADIYLAhTO^QRKRPESGSIRKTFRSDS 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

PDKCSISGHGSLNSISRHSSLKNRLDSPQIRKTAT 

AGRSKSFDNHRPMDPEVIAQDIEATMNSALKELR 

ELERQSSVKHTPDWLDTLEPLKTSPWAPTSEPS 

SPLHTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPATVRPKPTWFPKTNATSPGVNSST 

SPQSTDKSCTV 


3534 


A 


1 


2640 


FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGLKLSADVKPFVPRFAGLNVAWLESSEACV

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 

KKTYDEKKTYDQQKFDSERADGTISSEIKSARGS 

HHLSIYAENSUCSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREWKPAAVLSKGEIWKNNPNESV 

TANAATNSPSCTRELSWTPMGYVVRQTLSTELS 

AAPKKVTSMI^KTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRJEDAEEFPNLAVAS 

ERRDR1ETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPVWSVGAV 

PVLSKECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREJDPKAKKPTSLKKULKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDELISTPSVEDKSEEPPGTELQRDTEASHLAPN 

H 1" 1FPKIHSRRFRDYCSQMLSKEVDACVTD1XKE 

LVRFQDRMYQKDPVKAKTKRRLVLGLREVLKH 

LKl^KKLKCVIISPNCEKIQSKGGLDDTLHTIIDYA 

CEQNIPFWALNRKALGRSLNKAVPVSVVGIFSY 

DGAQDQFHKMVELTVAARQAYKTMLENVQQE 

LVGEP\SLRHLPAYPHRAPAALQKMAPQP/VKEK 

EEPHYIEIWKKHLEAYSGCTLELEESLEASTSQM 

MNLNL 


3535 


A 


1747 


983 


LFQFQVCRS VLSPRAAGCrW SL APRSRGAAG SPR 1 

RYRGPQPQPAPPSALPNSRPSPVASGREMVVLSV 

PAEVTVIIXDIEGTTTPIAFVKDIIJTYIEENVKEY 

LQTHWEEEECQQDVSLLRKQVVFADVVPAVRKW 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGHKVESESYRKIADSIGCSTNNDLFLT 

DVTREASAAEEADVHVAVWRPGNAGLTDDEK 

TYYSLITSFSELYLPSST


3536


A 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTS 

IESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 

RRRAGSPRRCAPRPRACPQGWSRARHQPGGLCL 

LLLLLCQFMEDRSAQAGNCWLRQAKNGRCQVL 

YKTELSKEECCSTGRLSTSWTEEDVNDNTLFKW 

MIFNGGAPNCIPCKETCENVDCGPGKKCRMNKK 

NKPRCVCAPDCSNTTWKGPVCGLDGKTYRNECA 

LLKARCKEQPELEVQYQGRCKKTCRDVFCPGSS 

TCVWDQTNNAYCVTCNRICPEPASSEQYLCGND 

GVTYSVSACHLRKATCLLGRSIGLAYEGKCIKAK 

SCEDIQCTGGKKCLWDFKVGRGRCSLCDELCPD 

SKSDEPVCASDNATYASECAMKEAACSSGVLLE 

VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCNDRF 

LTSIPTGIPEDATTLYLQNNQINNAGIPSDLKNLL 

K\^IOYLYHNSLDEFPTNLPKYVKELHLQEhWIR 

TITYDSLSKIPYLEELHLDDNSVSAVSIEEGAFRD 

SNYLRLLFLSRNHLSTIPWGLPRTIEELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLROLYRLDMSNNNLSNLPQ 

GIFDDLDNITQLILRNNPWYCGCKMKWVRDWL 

QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGWSTIQITTAIPNTVYPAQGQWPAPVTK 

QPDDCNPKLTKDHQTTGSPSRKTITITVKSVTSDTI 

H1SWKLALPMTALRJLSWLKLGHSPAFGSITETIVT 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD

ETPVCffiTETAPLRNfYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SRNCAYSKGRRRKDDYAEAGTKKDNSILEIRETS 

FQMLPISNEPISKEEFVMTIFPPNGMNLYKNNH 


3538 


A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACIWNKTENSLVGLPSCTVDEVTECNL 

ELKDTMGIADKTENTLERNKIEPLGYCEDAESNR 

QLESTEFNKSNLEVVDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKIESHETANLQDDRNSQSSSV 

S YLESKS VKSKrniCP VfflSKQNM 1" 1'UAPKKIVAA 

KYEVIHSKTXVWKSVKROTDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKK 

TLQDQTLVQDFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEOTGVEHFKEEDKLKLKKPEKNLQPRQRRSS 

KSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYMW

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYmDTVKHKVKILKRESGEGRNSSDCRD 

NEDCKWQLAPLRKMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQENtKKKKVVEKGVL 

NVHPAASASKPSADQIRQSVRHSLKDILMKRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTTEMIEKEQREVE 

RRPITKITHKGEffiffiSDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEKXRKEEVDSMSKDTTSQHRQHLF 

DLNCKICIGRMAPPVDDLSPKKVKVVVGVARKH 

SDNEAESIADALSSTSNILASEFFEEEKQESPKSTF 

SPAPRPEMPGTVEVESTFLARLNl^l WKGF1NMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKIKASGTKEICVVRFTPVTEEDQISYT 

IXFAYFSSRKItYGVAANNMKQVKDMYLIPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGUIRQKLKRQ 

HSACASTSHIAETPESAPPIALPPDKKSKIEVSTEE 
APEEENDFFNSFTTVLHKQRNKPQQNLQEDLPTA 
VEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 
Ij\NKPLPVDDILQSLLGTTGQVYDQ\AQSVMEQ 
>TTVKEIPFLNEQTNSKJEKTDNVEVTDGENKEIK 
VKVDNISESTDKSAEIETSVVGSSSISAGSLTSLSL 
RGKPPDVSTEAFLTNLSIQSKQEETVESKEKTLKR 
QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN
VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 
TTSESKDGDSCRNGEKHMLPGLSHNKEHLTEQIN 
VEEKLCSAEKNSCVOOSDNLKVAONSPSVENIOT 
SQAEQAKPLQEDILMQNIETVHPFRRGSAVATSH 
FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 
QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 
PPPLIJPPPGFG\FA\QNPMVPWPPVV\HLP\GQPQR 
MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 
WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 
ERHEKEWEQESERHRRRDRSQDKDRDRKSREEG 
HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 
KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 
DHTDRTKSKR


3539 


A 


157 


1769 


GSWTVELSLKPSASPSLKWVCLPGAAAVNKHRS 

GAGGURSLIQCTWAPAGPARRGGRGIEDFPYLF

FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 

NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 

NVTWTELEDRDGRVYPHPQDLLAALPLALVLLA 

MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 

EKHFLTEGHRPKJEPOLSLLAAOCGLTLOOTORW 

FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL

SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 

EAXSLYWWYLLELGFYLSLLIRLPFDVKRKGGGP 

SSIKPRPHYDPPSTA\DFKEQVIHHFVAVILMTFSY 

SANLLRIGSLVLLLHDSSDYLLEACKMVNYMQY 

QQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESI 

SNRGPFFGYYFFNGLLMLLQLLHVFWSCLELRML 

YSFMKKGQMEKDIRSDVEESDSSEEAAAAQEPL 

QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 


3540 


A 


267 


1397 


SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKKLKRYFVDYRRYLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHV1LRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVGTL 

VKEGGRWADLSCVGDEYIAALGGAGGKGNRF 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA

GMVGFPNAGKSSLLRAISNARPAVASYPFTTLKP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA

FLRHBERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKIDLPEAQANLSQLRDH 

LGQEVTVLSALTGENLEQLLLHLKVLYDAYAEA 

ELGQGRQPLRW 


3541 


A 


1 


8008 


DTQVSETLKRFAGKVTTASVKERREILSELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTCLLVRTVFPSRAKRQGDI 

WNKXVEVQCLLLLEVLGGSHKHAVDGAVKKLT 

KLWKENPGLVEQYLSAILSLEPNQNYAQMLGLL 

VQFCTSHKEMDWSQHKSAIXDFYMKNILMSK 

VKPPKYLLDSCAPLLRYLSHSEFKDLILPTIQKSL 

LRSPENVESTISSLLASVTLDLSQYAMDIVKGLAG 

HUCSNSPRU^EAV1j\LIW1^RQCSDSSAMES 

TKHLFAILGGSEGKLTVVAQKMSVLSGIGSVSHH 

WSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 

VSVLALWCNRFTMEVPKKLTEWFKKAFSLKTST 

SAVRHAYLQCMLASYRGDTLLQALDLLPLLIQT 

VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 

EAKLSSFWQLIVDEKKQVFTSEKFLVMASEDAL 

CTVLHVLTERLFLDHPHRLTGNKVQQYHRALVA 

V1XSRTWHVRRQAQQTVRKIXSSLGGFKLAHGL 

LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY

VPPRVLQEALCVISGVPGLKGDVTDTEQLAQEM 

LnSHHPSLVAVQSGLWPALLARMKIDPEAFITRH 

LIXJIIPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 

PQLISTITASVQNPALRLVTREEFAIMQTPAGELY 

DKSEQSAQQDSIKKANMKRENKAYSFKEQIIELE 

LKEEIKKKKGIKEEVQLTSKQKEMLQAQLDREA 

QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 

PVLVDSFLPLLKSPLAAPRIKNPFLSLAACVMPSR 

LKALGTLVSHVTLRLLKPECVLDKSWCQEELSV 

AVKRAVMLLHTHTITSRVGKGEPGAAPLSAPAFS 

LVFPFLKMVLTEMPHHSEEEEEWMAQILQILTVQ

AQLRASPNTPPGRVDENGPELLPRVAMLRLLTW 

VIGTGSPRLQVLASDTLTTLCASSSGDDGCAFAE 

QEEVDVLLCALQSPCASVRETVLRGLMELHMVL 

PAPDTDEKNGLNLLRRLWVVKFDKEEEIRKLAE 

RLWSMMGLDLQPDLCSLLIDDVIYHEAAVRQAG 

AEALSQAVARYQRQAAEVMGRLMEIYQEKLYR 

PPPVLDALGRVISESPPDQWEARCGLALALKKLS 

QYLDSSQVKPLFQFFVPDALNDRHPDVRKCMLD 

AALATLNTHGKENVNSLLPVFEEFLKNAPNDAS 

YDAVRQSVWLMGSLAKHLDKSDPKVKPIVAKL 

IAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LKQQEMMAALTDAIQDKJCNFRRREGALFAFEM 

LCTMLGKLFEPYVVHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNLSAHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQKAGQQALRQIGSVIRNPEJDLAI 

APVLLDALTDPSRKTQKCLQTLLDTKFVHFIDAP 

SLALIMPIVQRAFQDRSTDTRKMAAQIIGNMYSL 

TDQKDLAPYLPSVTPGUCASLLDPVPEVRTVSAK 

ALGAMVKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPHVRDG YIMMFhTVTLPITFGDKFTP Y VGPH 

PCILKALADENEFVRDTALRAGQRVISMYAETAI 

ALIXPQLEQGLFDDLWRIRFSSVQLLGDLLFHISG

VTGKMTTETASEDDNFGTAQSNKAIITALGVERR

NRVLAGLYMGRSDTQLWRQASLHVWKIWSN 

TPRTLREI1J > TLFGLIXGFLASTCADKKTIAARTL 

GDLVRKLGEKILPEIIPn .EEGLRSQKSDERQG VCI 

GLSEIMKSTSRDAVLYFSESLVPTARKALCDPLE 

EVREAAAKTFEQLHSTIGHQALEDILPFLLKQLD 

DEEVSEFALDGLKQVMAKSRWLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVN1LAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLLEATRSPEVGMRQAAAHLNIYCSRSKADYTS 

HLRSLVSGLIRLFNDSSPVVLEESWDALNAITKK 

LDAGNQLALIEELHKEIRLIGNESKGEHVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADALRPSVVSrTGPLIRILGDRFSWNVKAAL 

LETLSIXLAKVGIALKPFLPOIX>TITTKALODSNR 

GVRLKAADALGKLISIHIKVDPL1 , "1'ELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDA\nKKNIVS 

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 

AVLQQCLLADVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMDLSSATADRIPIAVSGV 

RGMGFLMRHHIETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLDNTKDKNTVVRAYSDQAIVNLLKMRQGEEVF 

QSLSKIUDVASLE\a-NEV>TRRSLKKLASQADSTE 

QVDDTILT 


3542 


A 


62 


1130 


PWNPODFPGNRGLMG\OKGEIGPP\GOOGKKGAP 

GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 

GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 

GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 

GERGEKGEPGVRGAIGSKGESGVDGLMGPAGPK 

GQPGDPGPQGPPGLDGKPGREFSEQF1RQ V CTD V 

IRAQLPVLLQSGRIRNCDHCLSQHGSPGIPGPPGPI 

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 

GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 

SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 

PSLCFSVIARRDPFRKGPNY 


3543 


A 


654 


194 


PARSLEKMKASVVLSLLGYLVVPSGAYILGRCTV
AKKLHDGGIJ5YFERYSLE>nVVCLAYFESKFNPS\ 
AIYENTREGYTGFGLFQMRGSDWCGDHGRNRC 
HMSCSALLNPNLEKTDCCAKTIVKGKEGMGAWP 
TWSRYCQYSDTLARWLDGCKL


3544 


A 


2 


1074 


SCRLAAGRLAQWLLRASRSGMLRAGWLRGAAA 

LALLLAARWAAFEPirVGLAIGAASAITGYLSY 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH

LATEVIXFKALTGFRNNKNPKKPLTLSLHGWAGT 

GKNFVSQMGAEN1JIPKGLKSNFVHLFVSTLHFP 

HEQKDCLYQDQLQKWIRGNVSACANSVFIFDEM

DKX\HPGIIEVAJKPFLDYYEHVERVSYR\KAIFIFLS 

NAGGDLITKTALDFWRAGRKREDIQLKDLEPVL 

SVGVFNNKHSGLWHSGLIDKNLIDYFIPFLPLEYR 

HVKMCVRAEMRARGSAIDEDIVTRVAEEMTFFPX 

RDEKIYSDKGCKTVQSRLDFH 


3545 


A 


3 


273 


SAQGRSWGRFYRQQCRHPGIIPMIGLICLGMGSA 

ALYlXRIJaJlSPDVW*SWDRKNl^EPWNRLSPN 

DQYKFLAVSTDYKKLKKDRPDF 


3546 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 

PKVPIKMQVKHWPSEQDPEKAWGARVVEPPEK 

DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 

KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 

EERPRLWVMFNHQVLLGPEEDQDHIYHPQ*GSR 

GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 

PKVPIKMQVKHWPSEQDPEKAWGARWEPPEK 

DDQLWLFPVQKPKLLTTEEKPRGQGRGPILPGT 

KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 

EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 

GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKJBKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVKLLNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEERLRREEEERRRIEEERLRLEQQKQQEMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HyQQYMQQLYQVQLAQQQAALQKQQEWVAG 

SSLPTSSKVEC^CIXJVI*CQFNRQAKTHTDSSEKE 

LEPEAAEEALENGPKESLPVIAAPSMWTRPQIKD 

FKEKIQQDADSVITVGRGEVVTVRVPTHEEGSYL 

FWEFATDNYD1GFGVYFEWTDSP,NTAVSVHVSE 

SSDDDEEEEEN1GCEEKAKKNANKPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 


3549 


A 


1837 


3593 


PAVLVLEPASQSRKQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPK1GAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRDDAATRRRRGRRKHVEGGMD 

LIFLKEQTLQAGILEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSIPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAIPKEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG 

LIAGMDLVGLQNMRNMPGIPLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LI^PPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPrXIPGVSPGLIYPSMFI^PGMGMAIJAM 

QQARHSEIVGLESQKRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA

LKDSNNDTN 


3550 


A 


287 


39 


QLNLNKIATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKLEEGCFDKAYWLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WhmQMLPKSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQEQLQVLNKAKERQLENLIEKLNESERQIRY 

LNHQLVDKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQ1KALETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESIVMGLTKKY 

EEQVI^LQKMLDATVTALKEQEDICSRLKDHVK 

QLERNQEAIKLEKTEIINKLTRSLEESQKQCAHLL 

QSGSVQEVAQLQFQLQQAQKAHAMSANMNKA 

LQEELTELKDEISLYESAAKLGIHPSDSEGELNIEL 

TESYVDLGIKKVNWKKSKVTSIVQEEDPNEELSK 

DEFILKLKAEVQRLLGSNSMKRHLVSQLQNDLK 

DCHKKIEDLHQVKKDEKSIEVETKTDTSEKPKNQ 

LWPESSTSDWRDDILLLKNEIQVLQQQNQELKE 

TEGKLRNTNQDLCNQMRQMVQDFDHDKQEAV 

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL 

FEAYERTHLQLRSELDKLNKEVTAVQECYLEVC 

REKDNLELTLRKTTEKJEQQTQEKIKEKLIQQLEK 

EWQSKLDQTIKAMKXKTLDCGSQTDQVTTSDVI 

SKKEMAIMIEEQKCTIQQNLEQEKDIAIKGAMKK 

LEIELELKHCENITKQVEIAVQNAHQRWLGELPE 

LAEYQALVKAEQKKWEEQHEVSVNKRISFAVSE 

AKEKWKSELENMRKNILPGKJELEEKIHSLQKELE 

LKKEEVPWIRAELAJCARSEWhnCEKQEEIHRIQE 

QNEQDYRQFLDDHRNKINEVLAAAKEDFMKQK 

TELLLQKETELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEHISDSEDKQLLEI 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV 

ENADPEWKKRNMAELSKDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEIKDLCCGHC 

FQELEKAKQECQDLKGKLEKCCRHLQHLERKHK 

AVVEKIGEEN>^CVVEELIEENNDMK^^CLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK

AVKKIKCDMLRYIQESKERAAEMVKAEVL*ERQ 

ETARKMRKYYLICLQQILQDDGKEGAEKKIMNA 

ASKLATMAKLLK1PISSKSQSKTTQSGMSK 


3552 


A 


771 


375 


ARTRQTSGQAREPEKESPAPGGGGLAEIRSRQQL 
SQTSRIPPLAKDQAVEAMFPPARGKELLSFEDVA 
MYFTREEWGHLNWGQKDLYRDVMLENYRNMV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 


3553 


A 


76 


72 


PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPSAGQATGPGEGRRSTE 

SEVYDIX3TOTFFWRAHTLTVLFILTCTLGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSWYELFLIFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNIWDKLDGFWAHFLGWYLKTLMIRDWW 

MCMnSVMFEFLEYSLEHQLPNFSECWWDHWIM 

DVLVCNGLGIYCGMKTLEWLSLKTYKWQGLWN 

IPTYKGKMKRIAFQFTPYSWVRFEWKPASSLRR 

WLAVCGIILVF1,LAELNTTYUCFV1.WMPPEHYLV 

LUO.VFFVNVGGVAMREIYDFMDDPKPHKKLGP 

QAWLVAAITATELLIWKYDPHTLTLSLPFYISQC 

WTLGSVLALTWTVWRFFLRDITLRYKETRWQK 

WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 

GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR 


3554 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDEDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRJFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH

RFFGPNAEISQPPALSQLYNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA

VRDMMANFHLNDLEAPHEDDA*GEGEWD


3555 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRJLERJDEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA

VRDMMANFHLNDLEAPHEDDA*GEGEWD


3556 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 

VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPWPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL
VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQERHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA

tPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3558 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VI^IMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA 

ARVL\KAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACrHTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKWSA 

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 

SSFNS>TITLTRLXVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3559 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VN1V1ASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVDCEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

AT ,1 -KSS ACFTLQELKLNNCGMGIGGGKDLAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCTIKRDAAIJ^VAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSVFXDEATVRMAVQDAVDALMQKAFNS 

SSFNSOTFLTRLL VHMGLLKSEDK VKA1ANL YG P 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE

SCSFARHSLLQTLYKV 


3560 


A 


2 


1198 


FVRELPRPRPGAATAAIMVSVimVDTSHEDMIH 
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQILIA 

DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV 

IIWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLDLACGSSDGAISLLTYTGEGQWEVKKINNAHT 

IGCNAVSWAPAVVPGSLIDHPSGQKPNYIKRFAS 

GGCDNLIKLWKEEEDGQWKEEQKLEAHSDWVR 

DVAWAPSIGLPTST1ASCSQDGRVFTWTCDDASS 

NTWSPKLLHKFNDVVWHVSWS1TAN1LAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

GKSPQLQQDYFPRRSYRCSHRLDCLNVIGDAL 


3561 


A 


540 


86 


WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 
VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 
RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR I 
YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT
GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR

KVQSGNmAAKTlADIIRTCLGPKSMMKMLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVnLAGEMLSVAEHFLEQQMHPTV 

\aSAYRKALDDMISTLKKISIPVDISDSDMMLNIIN 

SSITTKAISRWSSLACNIALDAVKMVQFEENGRK 

EIDIKKYARVEKIPGGIIEDSCVLRGVMINKDVTH

PRMRRYDCNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTRILQMEEEYIQQLCEDUQLKPDWITEKGIS 

DLAQHYLMRANITAIRRVRKTDNNRIARACGARI 

VSRPEELREDDVGTGAGLLEKiaGDEYFTFITDC 

KDPKACTILLRGASKEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVEPRTLIQNCGASTERLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYKTAVETA\HLLIJRIDDrVSGHKKKGDD 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAV
DDLQFEEFGNAATSLTAWDATTVNIEDPGETPK
HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK

KSSPFWTFEYYQTFFDVDTYQVFDRIKGS1XP1PG 

KNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYVPEFRKVSIAATIIYAYAWLVP 

LALWGFLMWRNSKVMNIVSYSFLEIVCVYGYSL

FIYIPTAILWIIPHKAWWILVMIALGISGSLLAMT 

FWPAVREDNRRVALATIVTIVl^lJHMLLSVGCLA 

YFFDAPEMDHLPTTTATPNQTVAAAKSS


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGnFTTFWGLVGIAGPWFVPKGPNRGVIITML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL

RPFHLAAVRNEAVVISGRJCLAQQIKQEVRQEVEE

WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

AVVGmSETIMKPASISEEELLNLINKLNNDDNVD 

GLLVQLPLPEHIDERRICNAVSPDKDVDGFHVIN 

VGRMCLDQYSMLPATPWGVWEIIKRTGIPTLGK 

NVVVAGRSKNVGMPIAMLLHTDGAHERPGGDA 

TVTISHRYTPKEQLKKiniLADIVISAAGffNLITA 

DMIKJEGAAVTDVGINRVHDPVTAKPKLVGDVDF 

EGVRQKAGYITPVPGGVGPMTVAMLMKNTOAA 
KKVLRLEEREVLKSKELGVATN 


3566 


A 


3 


1130 


SCRRGRQQQRRNVSLSSQFAHTMAAPAQQTTQP 

GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 

QGILITCNMNERKCVEEAYSLLNEYGDDMYGPE 

KFTOKDQQPSGSEGEDDDAEAALKKEVGDIKAS 

TEMRLRRFQSVESGANNWFIRTLGIEPEKLVHHI 

LQDMYKTKKKKTRVILRMLPISGTCKAFLEDMK 

KYAETFLEPWFKAPNKGTFQrVYKSRNNSHVNR 

EEVIRELAGIVCTLNSENKVDLTNPQYTVVVEIIK 

AVCCLSWKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENTEELGQTKPl'SNPQVVNEGGAKPELASQATE 

GSKSNENDFS 


3567 


A 


248 


3498 


GKKDSSPWTCPFHPPLQLFFVIRNTRQLGDFHLA 

KIKVRNYWTADGD1JDIGAKNVKLYVNKNLIFNG 

KLDKGDREAPADHSILVDQKNEKSEQJLEEAMNA 

HSEESKGTHEMAGASGDKELGLGCSPPAETLAD 

AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRJLS 

AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 

ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 

KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA

RDKGLRHEPGWGTSRSVNTKERPQRATTKVHSD 

DSDEFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 

PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 

AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP

PSKKGEQPGLSRGQDGYSGETDAGGDFKIPVLPY 

GQRLVIDnCSTWGDRHYVGLNGffilFSSKGEPVQI 

SNIKADPPDmiLPAYGKDPRVVTNLlDGVNRTQ 

DDMHVWLAPFTRGRSHSITIDFTHPCHVALIRIW 

NYNKSRfflSFRGVKBITMLLDTQClFEGEIAKASG 

TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD 

VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 

GADERIPELELPSSSPVPQVTTPEPGIYHGICLQLN

FTASWGDLHYLGLTGLEWGKEGQALP1HLHQ1S 

ASPRDLNELPEYSDDSRTLDKLIDGTNITMEDEH 

MWLIPFSPGLDHVVTIRLDRAESIAGLRFWNYNK 

SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 

NCHFDFAQEILFVDYLRAQLLPQPARRLDMRSLE 

CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 

GLELYDERGEKIPLSENNIAAFPDSVNSLEGVGG 

DVRTPDKLIDQVNDTSDGRHMWLAPILPGLVNR 

VYVIFDLFrrVSMIKLWNYAKTPHRGVKEFGLL 

VDDLLVYNGIlJVMVSHLVGGILPTCEPTVPYHTI 

LFTEDRDIRHQEKHTTISNQAEDQDVQMMNENQ

nTNAKRKQSWDPALRPKTCISEKETRRRRC 


3568 


A 


50 


1724 


AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 

LKSKEEKDAELDKR1EALRRKNEALIRRYQEIEE 

DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 

LGPSRRSPGTPRPPGASKGGRTPPQQGGRAGMG 

RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKEWEERRRQNBEKMNEEME 

KIAEYERNQREGVLEPNPVRNFLDDPRRRSGPLE 

ESERDRREESRRHGRNWGGPDFERVRCGLEHER 

QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 

EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEBEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDmAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3570 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAJDFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAEFGGARPREEV 

VQKEQE 


3571 


A 


28 


131 


RHFFGNLCAMRAKWRKXRMRRLKRKRRKMRQ 
RSK 


3572 


A 


3 


1202 


QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AJCARAGSVRLDCVVrEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT 

SAERELQQERINEVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESWFIYSMPGYKCSIKERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGVVHEDLRLLLETHLPSKKKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 

VRLHFHKLVKGLTDLSACKAQLGLGHSYSRAKV 

KFKVNRVDNMnQSISIXDQLDKDINTFSMRVRE 

WYGYHFPELVKHNDNATYCRLAQFIGNRRELNE 

DKLEKLEELTMDGAKAKAILDASRSSMGMDISAI 

DUNIESFSSRVVSLSEYRQSLHTYLRSKMSQVAP 

SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 

GAEKAIJTIALKTRGNTPKYGLIFHSTFIGRAAAK 

NKGRISRYLANKCSIASRDDCFSEVPTSVFGEKLR 

EQVEERI^FYETGEIPRK3^DVMKEAMVQAEAE 

EAAAEITRKLEKQEKKRLKKEKKRLAALALASS 

ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 

EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 

KRKKSTPKEETVT4DPEEAGHRSRSKXKRKFSKEE 

PVSSGPEEAVGKSSSKKKKKFHKASQBD 


3574 


A 


284 


2032 


CGNERTARLWVQPVVSTMPQASEHRLGRTREPP 

VNIQPRVGSKLPFAPRARSKERRNPASGPNPMLR

PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 

DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 

STSLRRLGGFPGPPTLFSIRTEPPASHGSFHMISAR 

SSEPFYSDDKMAHHTLLLGSGHVGLRNLGNTCF 

LNAVLQCLSSTRPLRDFCLRRDFRQEVPGGGRA

QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 

KYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGR

RAPPILANGPVPSPPRRGGALLEEPELSDDDRANL 

MWKRYLEREDSKTVDLFVGQLKSCLKCQACGY 

RSTTFEVFCDLSLPIPKKGFAGGKVSLRDCFNLFT 

KEEELESENAPVCDRCRQKTRSTKKLTVQRFPRI 

LVLHLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 

ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 

CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 

MQEPPRCL 


3575 


A 


1 


2408 


RELDSLADLPERQCPPYANGLSTSHLRSSSVEDVK 

LUSEGRPTIEVRRCSMPSVICEHTKQFQTISEESN 

QGSLLTVPGDTSPSPKPEVFSNVPERDLSNVSNIH 

SSFATSPTGASNSKYVSADRNLIKNTAPVNTVMD

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSNIPDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPCSFPSQSLSDAES 

ISKHMSLSYVANQEPGELQQKNAVQnSSALDTD 

NESTKDTENTFVLGDVQKTDAFVPVYSDSTIQEA

SPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL 

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSILSSLENHSQQSTQPEM 

HKYGQLVKVELEENAEDDKTENQIPQRMTRNK 

ANTMANQSKQILASCTLLSEKDSESSSPRGRIRLT 

EDDDPQIHHPRKRKVSRYPQPVQVSPSLLQAKEK 

TQQSLAAIVDSLKLDEIQPYSSERANPYFEYLHIR 

KKIEEKRKLLCSVTPQAPQYYDEYVTFNGSYLLD 

GNPLSKICIPTTIPPPSLSDPIJCEIJFRQQEVVRMKL 

RLQHSIEREKXIVSNEQEVIJR.VHYRAARTLANQT 

LPFSACTVLLDAEVYNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 


3576 


A 


5 


1421 


LRLAWHDGARWPLGTPRAAATRREAAALPPVT 

LALLCLDGVFLSSAENDFVHRIQEELDRFLLQKQ 

LSKVLLFPPLSSRLRYLIHRTAENFDLLSSFSVGE 

GWKItRTVICHQDIRVPSSDGLSGPCRAPASCPSR 

YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ

PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 

PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 

CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 

GSTLQLDLEKGKESLLEKRLVAEEEEDEEEVEED 

GPSSCSEDDYSELLQEITDNLTKKEIQIEBCIHLDTS 

SFMEELPGEKDLAHWEIYDFEPALKTEDLLATF 

SEFQEKGFRIQWVDDTHALGIFPCRASAAEALTR 

EFSVLHRPLTQGTKQSKLKALQRPKLLRLVKER 

PQTNATVARRLVARALGLQHKKKERPAVRGPLP 

P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RNLYRNVMLENYSNLVFLGIVVSKPDLIAHIJBQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

IKDSFQKVILRRYEKRGHGKLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FS^SNRHNlRHTCKJCPFKCIECGKAFNQFSTLrrH 

KKIHTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNOSSTLTKHKKIHTGEKPYKCEECGKAF 

1^\/\JAm U A » JL JLm** m -m m. m, mm ^m-^m * m m m * m ^m, m m m-^^m^m^ ^^m mm mm 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL 

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEF1H

MGKKHYKCEECGKAFIWSSVLTRHKRVHTGEKP 

YKCEECGKAFKYSSTLSSHKRSHTGEKPYKCEEC 

GKAFVASSTLSKHE1IHTGKKPYKCEECGKAFNQ 

SSSLTKHKKIHTGEKPYKCEECGKAFNQSSSLIK 

HKKmTGEKPYKCEECGKAFNQSSTLIKraCKlHr 

REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR 

CRECGKAFNHSATLSSHKIOHSGEKPYECDKCG 

KAFISPSSLSRHEHHTGEKP 


3578 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKN>nORYFGTNSVlCSKJCDKOSVRTEETS 

KETSESQDSEKENTKKDLLGI1KGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNHSDMKVARSATARV 

RSRPELRJQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVT^QPLQNGFEELIQWTKEGKJLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVKQKVEHIEWFRKYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3579 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMIXTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVICSKICDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 
EEESRAQRDAKRPKISFSNIISDMKVARSATARV 
RSRPELRIQFDEGYDNYPGQEKTDDLKKRKMFT 
GKRLMFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 
GFDDIXjSEFHEHIFLEKHLESFPKQGPIRHFMELV 
TCGI^K^YLSVKQKVEHIEWFRNYFNEKKDILK 
ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRHLLGRMSHLPMKLLRKKDBKRNLK 
LRQRNLKFQGASNLTLSETQNGDVSEETMGSRK 
VKKSKQKPMNVGLSETQNGGMSQEAVGNIKVT 

KSPQKST\^TNGEAAMQSSNSESKKKKKKJKRK 

MV>nDAEPDTKKAKTENKGKSEEESAETTKETEN 

NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC

NLVNENTLKADCEMGFTNMTEIQHKSIRPLLEGR 

DLLAAAKTGSGKTLAFLIPAVELIVKLRFMPKNG 

TGVLILSPTRELAMQTFGVLKELMTHHVHTYGLI 

MGGSNRSAEAQKLGNGINUVATPGRLLDHMQN 

TPGFMYKNLQCLVIDEADRILDVGFEEELKQIIKL 

LPTJEUIQTMLJSATQTRKVED1ARISLKKEPLYVG 

VDDDKANATVDGLEQGYWCPSEKRFLLLKli'L 

KKNRKKJG^MVFFSSCMSVKYHYELLNYIDLPVL 

AIHGKQKQNKJR.TTTFFQFCNADSGTLLCTDVAA 

RGLDIPEVDW1VQYDPPDDPKEYIHRVGRTARGL 

NGRGHALLILRPEELGFLRYLKQSKVPLSEFDFS 

WSKISDIQSQLEKLIEKNYFLHKSAQEAYKSYIRA 

YDSHSLKQIFNVNNLNLPQVALSFGFKVPPFVDL 

NVNSNEGKQKKRGGGGGFGYQKTKKVEKSKIF 

KHISKKSSDSRQFSH 


3581 


A 


23 


453 


LCRCICIKNITPHCLWDKVLSQFTYILDNLSNFMS 

HHPHSLRNSCLIRMDLLYWQFTIYTITFCFSPILSG 

RLTLSAQHISHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSWHLPIVYKGSMT 

QVSPH 


3582 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 

SNHLQDKIQKXYERKiKEGMDMNYIIQRKKEFRN 

PSIYEKLIQFCA1DELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTXJEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTV1S 

AVGTIVKKAKQ 


3583 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEDCIPPEPPGRC 

SNHLQDKIQKLYERKJKEGMDMNYIIQRKKEFRN 

PSIYEKLIQFCAEDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAWTVTTSASGSKTTVIS 

AVGTIVKKAKQ 


3584 


A 


3 


1139 


PGSTISSRADRLGAPVLAHPKMAERQEEQRGSPP 

LRAEGKADAEVKLILYHWTHSFSSQKVRLVIAE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGENIICEATQIIDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRFJ J J>SLPMDAYTHGCILHPELTV 

DSMBPAYATTRIRSQIGNTESELKKLAEENPDLQE 

AYIAKQKRLKSKLLDHDKVTKYLKKJLDELEKVL 

DQVETELPRKNEETPEEGQQPWLCGESFTLADVS 

LAVTLHRLKFLGFARRNWGNGKRPKLETYYERV 

LKRKTFNKVLGHVNNILISAVLPTAFRVAKKRAP 

KVLGTTLWGLLAGVGYFAFM1JFRKRLGSMILA 

LRPRPNYF 


3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELTLLHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARLFTKVQQIRRAEPNVLLLDA 

GDQYQGTIWFnTYXGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGL1EPLLKEAKFP1LSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

I^NPGimWEDEITALQPEVDKLKTLKVNKIIAL 

GHSGFEMDKXIAQKVRGVDVVVGGHSNTFLYT 

GNPPSKEVPAGKYPFrVTSDDGRKVPWQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPS 

KADINKWIUKLDNYSTQELGKTIVYLDGSSQSC 

RFRECNMGNLICDANIINNNLRHTDEMFWNHVS 

MCILNGGGIRSPIDERNNGTITWENLAAVLPFGG 

TFDLVOLKGSTLKKAFEHSVHRYGQSTGEFLQV 

GGIHVVYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PLKMDEVYKV1LPNFLANGGDGFQMIKDELLRH 

DSGDQDINVVSTYISKMKVrYPAVEGRIKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 


3586 


A 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESVVQQVEQN 

LELMTKRAVKAENHVVKJLKQEISLLQAQVSNFQ 

RENEAIJICGQGASLTVVKQNADVALQNLRVVM 

NSAQASIEQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 


3587 


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTIFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNACDBC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSIVWVHAFPELFLS 

CLNHPDKKIVAYSSMILFTSLNHERMKELEENLN 

IAIDVDDAYQKHPESEWPFLIITDLFLKSPELVQA 

MFPKLNNQERVTLLDLMIAKITSDEPLTKDDIPVF 

LRHAELIASTFVDQC3CTVLKLASEEPPDDEEALA 

TIRIXDVLCEMTVNTELLGYLQVFPGLLERVIDL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF

KSHLIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

CMSDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASLLKKVGFEVEKKGEKtlLKSTRD 

TPKP 


3588 


A 


3 


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTPPPSYGHQPQT 

GSGESSGASGDKJDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFN1TDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCNKPIAGQWTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RCGFCNQPIRHKMVTALGTHWHPEHFCCVSCGE 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GIU^FHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 

CFLKLFG 


3589 



226 



6793 



SPPKKSRKCmSFRLISAERWRFFLLILMEMPRKP 

RLTLFVQRRIENIATEREFDPEEFYYLLEAAEOHA 

KEGQGDCTDIPRYIISQLGLNKDPLEEMAHLGNY 

DSGTAETPETDESVSSSNASLKLRRKPRESDFETI 

KLISNGAYGAVYFVRHKESRQWAMKKINKQNL 

1LRNQIQQAFVERDILTFAENPFWSMYCSFETRR 

HLCNTVMEYVEGGDCATLMKNMGPLPVDMARM 

YFAETVl.ALEYLH^GIVHRDLia>DNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL 

DKQVCGTPEYIAPEVILRQGYGKPVDWWAMGII 

LYEFLVGCVPFFGDTPEELFGQVISDEINWPEKDE 

APPPDAQDLITLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFIPQLESEDDTSYFDTRSE 

KYrniMETEEEDDTNDEDFNVEERQFSSCSHRFSK 

VFSSIDRITQNSAEEKEDSVDKTKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDTESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSEHLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 

TKSLSASALSLMIPGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPIVfflSSGKNYGFT 

IRAmVWGDSDIYTVHHWWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVEELLLKSGNKVSI 

TTTPFENTS1KTGPARRNSYKSRMVRRSKKSKKK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTNSSQ 

SSSPSSSAPNSPAGSGHIRPSTLHGLAPKLGGQRY

RSGRRKSAGNIPLSPLARTPSPTPQPTSPQRSPSPL 

LGHSLGNSKIAQAFPSKMHSPPTIVRHIVRPKSAE 

PPRSPLLKRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQEEVQREQSQREAPLQSLDENVCDVPPLSRA 

RPVEQGCLKRPVSRKVGRQESVDDLDRDKLKAK 

VWKKADGFPEKQESHQKFHGPGSDLENFALFK 

LEEREKKVYPKAVERSSTFENKASMQEAPPLGSL 

LKDALHKQASVRASEGAMSDGPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEGTEKSS 

QAKELOICEKLDSKLANIDYLRKKMSLEDKEDN 

LCPVLKPKMTAGSHECLPGNPVRPTGGQQEPPPA 

SESRAFVSSTHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVLPSSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRESSERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKM-LSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGIPCEKELGKVRR 

GVEPKPEALLARRSLQPPGIESEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQ>IKASDGIGQGEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ 

KLSAVGEKQTLSPKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPS>nUDHRKAQPAGBGRTHMTKS 

DSIJPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDV1XP 

SPAPNTDRPISLSNEKDFWRQRRGKESLRSSPHK 

KAL 


3590 


A 


3 


935 


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLfflSPAEELYFGSTBSGEK 

KTLIVLTNVTKNIVAFKVRTTAPEKYRVKPSNSS

CDPGASVDIWSPHGGLTVSAQDRFLIMAAEME 

QSSGTGPAELTQFWKEWRNKVMEHRLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 


3591 


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP

PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK 

QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSR 

VEKARIALDKIIVQEMGESSKMRSRLTKLDAQVK 

EQMNRJIETRSDGLTFHYKAJDQVRAEGQQLVNQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIQTEWKKQEKDFQQFGKDVCSRWTLE

DSRKALVGNLK 


3593 


A 


3 


1837 


LSFEKVDIQTDNDLTKEMYEGKENVSFELQRDFS 

QETDFSEASLLEKQQEVHSAGN1KKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHTITGEQPSGCTG 

LGKSISFDTKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKLIWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRJHTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS 

QLRQHQSIHTGEKPYQCKECGKGFNNNTKLIQH 

QRIHTASLAEQLFKASGNHPNWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV

WSSPLSILKLPRTLIRISimQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSWL 

DD 


3594 


A 


39 


261 


RAAMMDTSRVQPIKLAJVDCVLGRTGSQGQCTQ

VRVEFMDDTSRS1IRSVKGPVREGDVLTLLESERE 

ARRLR 


3595 


A 


973 


68 


GRVGTKHQMADDAGAAGGPGGPGGPGMGKRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE 
DKEWMPVTKJLGRLVKX>MKIKSLEEIYLFSLPIKE 
SEIIDFFLGASLKDEVLKIMPVQKQTRAGQRTRF 
KAFVAIGDYNGHVGLGVKCSKEVATAIRGAnLA 
KLSIVPVRRGYWGNK1GKPHTVPCKVTGRCGSV 
LVRLXPAPRGTGIVSAPVPKKJLLMMAGIDE>CYTS
ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE
TVFITCSPYQEFTDHLVKTHTRVSVQRTQAPAVA
TT 


3596 


A 


106 


2960 


DERRVGAADMFGRSRSWVGGGHGKTSRNIHSL 

DHLKYLYH\^TKNTTVTEQNRNLLVETIRSITEIL 

IWGDQNDSSVFDFFLEKNMFVFFLNILRQKSGRY 

VCVQLLQTLNILFENISHETSLYYLLSNNYVNSn 

VHKFDFSDEEIMAYYISFLKTLSLKLNNHTVHFF 

YNEHTNDFALYTEAIKFFNHPESMVRIAVRTIT^ 

NVYKVSLDNQAMLHYIRDKTAVPYFSNLVWFIG 

SHVIELDDCVQTDEEHRNRGKLSDLVAEHLDHL 

HYLNDILIINCEFLNDVLTDHLLNRLFLPLYVYSL 

ENQDKGGERPKISLPVSLYLLSQVFLIIHHAPLVN 

SLAEVILNGDLSEMYAKTEQDIQRSSAKPSIRCFI 

KFlKlLERSLEMhnCHKGKRRVQKRPNYKNVGEE 

EDEEKGPTBDAQEDAEKAKGTEGGSKGIKTSGES 

EEIEMVIMERSKLSELAASTSVQEQNT1DEEKSA

AATCSESTQWSRPFLDMVYHALDSPDDDYHALF 

VLCLLYAMSHNKGMDPEKLERIQLPVPNAAEKT 

TYNHPLAERLIRIMKNAAQPDGKIRLATLELSCL 

IXKQQVLMSAGCIMKDVHLACLEGAREESVHLV 

RHFYKGEDIFLDMFEDEYRSMTMKPMNVEYLM 

MDASILLPPTGTPLTGIDFVKRLPCGDVEKTRRAI 

RVFFMLRSLSLQLRGEPETQLPLTREEDLEKTDDV 

LDLNNSDL1ACTVITKDGGMVQRSLAVDIYQMS 

LVEPDVSRLGWGWKFAGLLQDMQVTGVEDDS 

RALNITIHKPASSPHSKPFPILQATFIFSDHIRCI1AK

QRLAKGJUQARRMKMQRIAALLDLPIQPTTEVLG 

FGLGSSTSTQHLPFRFYDQGRRGSSDPTVQRSVF 

ASVDKVPGFAVAQCINEHSSPSLSSQSPPSASGSP 

SGSGSTSHCDSGGTSSSSTPSTAQSPAGIGHVTQ 


3597 


A 


427 


277 


GVRRIQHHWAQMHECm^HTYASLFCLFLLHTG 
KLCCLNSHRHFHCIKYSK 


3598 


A 


1 


503 


FRPRTKKATAMYLEHYLDS1ENLPCELQROTQL 

MRELDQRTEDKKAEIDILAAEYISTVKT1-SPDQR 

VERLQKIQNAYSKCKEYSDDKVQLAMQTYEMV 

DKHn^RLDADLARFEADLKDKMEGSDFESSGGR 

GLKKGRGQKEKRGSRGRGRRTSEEDTPKKKKH

KGG 


3599 


A 


2 


3907 


KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYTVSVGYQH 

DMIVNVWAWKKNIVVASNKVSSRVTAVSFSED 

CSYFVTAGNRHDCFWYLDDSKTSKVNATVPLLG 

RSGLLGELRNNLFTDVACGRGKKADSTFCITSSG 

LLCEFSDRRLLDKWVELRVYPEVKDSNQACLPP 

SSFTTCSSDKnRLWOTESSGVHGSTLHRNILSSDL 

IK1IYVDGNTQALLDTELPGGDKADASLLDPRVGI 

RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 

KVEAHDSEILCLEYSKPDTGLKLLASASRDRLIH 

VLDAGREYSLQQTLDEHSSSITAVKFAASDGQVR 

KDSCGADKSJYFRTAQKSGDGVQFTRTHHVVRK 

TTLYDMDVEPSWKYTAIGCQDKNIRIFNISSGKQ 

KKLFKGSQGEDGTLIKVQTDPSGIYIATSCSDKNL 

SIFDFSSGECVATMFGHSEIVTGMICFSNDCKHLIS 

VSGDSCIFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVELSVRSMU5UIQLETLAPSLQDPSQD 

SLAHPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHORLLSQEEGVFAQDLEPAPIEDGIVYPEP 

SDNPTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTCPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFFrLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN 

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PLPDRPTLAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL 


3600 


A 


1688


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 

QFSFQQGGWGASLADRLVRKCDVLNRGFSGYN 

TRWAKDGLPRLIRKGNSLDIPVAVTTFFGANDSAL

KDENPKQHIPLEEYAANLKSMVQYLKSVDIPENR 

VILriPrPLCETAWEEQCIIQGCKLNRLNSWGEY 

ANACLQVAQDCGTDVLDLWTLMQDSQDFSSYL 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 

WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARNKSEKRYYSEFL
QIAHLFNYGLSSFLREFOFLIKLLQ 


3602 


A 


37 


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRJTSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIITAAHFLWCNLHTPRRPACNAPWHSPVGEI 

SPPPRESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLKPSCSTDSSF 

TRTPVPTVSIASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEQPPFPEGYKVKQEPVITVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

PKEYLETFIFPVLLPGMASLLHQAKKEKCFEVVL 

QMTPSGGKACVWGHLPSSSHTI 


3603 


A 


286 


587 


NISNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR 

RMQKKYWKlXQVFIKATGKKEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKJIEKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT 

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS 

GEArGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

LEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKNIEKIIHVTTKLWSIKRL 

SHNHNRNSATKNLGKIFGNGNWPHSPSSTKNEN 

AKTGANSCEHDHYEKHLSHKQAPTHHQKIHPEE 

KLYVCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

VFTLKSNLITHQKIHTGQKPYKCSECGKAFFQRS 

DLFRHLRfflTGEKPYECSECGKGFSQNSDLSfflQ 

KTHTGEKHYE(^ECGKAFTRKSALRMHQRIHTG 

EKPYVCADCGKAFIQKSHFNTHQRIHTGEKPYEC 

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV 

FTHRThnLTTHQKTHTGEKPYMCAECGKAPTDQS 

NLIKHQKTHTGEKPYKCNGCGKAHWKSRLKIH 

QKSmGERHYECKDCGKAnQKSTLSVHQRIHTG 

EKPYVCPECGKAHQKSHFIAHHRIHTGEKPYECS 

DCGKCrTKKSQLRVHQKJHTGEKPNlCAECGKAF 

TDRSNLITHQKIHTREKPYECGDCGKTFTWKSRL 

NIHQKSHTGERHYECSKCGKAFIQKATLSMHQII 

HTGKKPYACTECQKAFTORSNLIKHQKMHSGEK 

RYKASD 



3605 



322 



SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQGI 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN
VIRDAVTYTEHAKRKTVTAMDVVYALKRQGRT
LYGFGG 



3606 



1749 



VPVTAEAKLMGFTQGCVTFEDVA1YFSQEEWGL 

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTD1LHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG

KDFLAPLGILQPQAIANYEKPNKISKCEEAFHVGI 

SHYKWSQCRRESSHKHTFFHPRVCTGKKLYESS 

KCGKACCCECSLVQLQRVHPGERPYECSECGKS 

FSQTSHLNDHRRIHTGERPYVCGQCGKSFSQRAT 

LIKHHRVHTGERPYECGECGKSFSQSSNLIEHCRI

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK

PYECGECGKLFRQSFSLWHQRJHTTARPYECGQ 

CGKSFSLKCGLIQHQLfflSGARPFECDECGKSFSQ 

RTTLNKHHKVHTAERPYVCGECGKAFMFKSKL 

VRHQRTHTGERPFECSECGKFFRQSYTLVEHQKJ 

HTGLRPYDCGQCGKSFIQKSSLIQHQWHTGERP 

YECGKCGKSFTQHSGLILHRKSHTVERPRDSSKC 

GKPYSPRSNTV 



3607 



92 



331 



AMAGPGPGPGDPDEQYDFLFKLVLVGDASVGKT 
CVVQRFKTGAFSERQGSTIGVDFTMKTLEIQGKR 
VKLQIWDTAGQER 



3608 



545 



379 



AIKGYIHLSAPRNRYMHTTASNGRMLFMKVTM 
YMRRGVQIMGWSVRMAFMACFTQ 



3609 



118 



873 



VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC 

GHSYCKGCLVSLSYHLDTKVRCPMCWQWDGS 

SSLPhTVSLAWVIEAIJlLPGDPEPKVCVHHKNPLS 

LFCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 

MKEELAALFSELKQEQKKVDEL1AKLVKNRTRJV 

NESDVFSWVIRREFQELRHPVDEEKARCLEGIGG 

HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF 

GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987 


DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

AJRLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDIHGQFYDLCELFRTGGQVPDTNYIFM

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VFDMLTVAALIDEQILCVHGGLSPDIKTLDQIRTI 

ERNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHTNNLKLICRAHQLVHEGYK 

FMFDEKLVTVWSAPNYCYRCGNIASIMVFKDVN 

TREPKLFRAVPDSERVIPPRTTTPYFL 


3611 


A 


2455


869 


AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAWLKATQEAPAASTLGSYSLPG 

TLAKSEn.ETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLHSERIQKADPQGPELGEACEKGNMLK 

RQRIKJfUEKXDFRQVTVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLWHQRIHTGEKPFECHECGKAFIQSAN 

LWHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ

KIHSI^EKTFKCNECEKAFSYSSQLARHQKVHTTE 

KCYECNECGKI'KIRSSNLIVHQRIHTGEKPFACN 

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL

IVHQRIHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE 


3612 


A 


318 


2245 


SPMAEAALVNTPQIPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMKDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHIRKEESS 

ALLLNSCKXPLSDNLFPCKDVEKDFPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCMCGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFIHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ 

SLLDHHRIHTGERPYECKECGKAFIHKXRLLEHQ 

RIHTGEKPYVCnCGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAnSKQTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RIHTGEKPYECGKCGKAFNKRYSLVRHQKVHIT 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

BCREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSP 

VQYAHLPHTFQFIGSSQYSGTYASFIPSQLIPPTAK 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 
QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 

PAQQNQYVHISSSPQNTGRTASPPAIPVHLHPHQ 

TMIPHTLTLGPPSQWMQYADSGSHFVPREATK 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGKSVPHPYESRHVWHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTL>HDK5GLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAVPHTFVTTALPKSENFNPEALVTQAA 

YPAMVQAQIHLPWQSVASPAAAPPTLPPYFMK 

GSnQLANGELKKVEDLKTEDFIQSAEISNDLKJDS 

STVERIEDSHSPGVAVIQFAVGEHRAQVSVEVLV

EYPFFVFGQGWSSCCPERTSQLFDLPCSKLSVGD 

VCISLTLK^KNGSVKKGQPVDPASVLLKHSKA 

DGLAGSRHRYAEQENGINQGSAQMLSENGELKF 

PEKMGL S AAPFLTKIEPSKPAATRKRRWS APESR 

KLEKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK


3614 


A 


3 


114 


FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 
LEG 


3615 


A 


3 


1603 


DAWALTNQFSDSKQHIEVLKESLTAKEQRAADLQ 

TEVDALRLRJLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKDMLDVKERKVNVLQKKIENLQEQL 

RDKJEKQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERT1ERLKEQRDRDEREKQEEIDNYKKJDL 

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS 

GLKKDSRLKTLEIALEQKXEECLKMESQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 

AEVDRLLEILKEVENEKNDKDKKIAELESLTSRQ 

VKDQNKKVANLKHKEQVEKKKSAQMT ,KR ARRR 

EDNLNDSSQQLQDSLRKKDDRIEELEEALRESVQ 

ITAEREMVLAQEESARTNAEKQVEELLMAMEKV 

KQELESMKAKLSSTQQSLAEKETHLTNLRAERR 

KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 

KKTQEEVAALKREKDRLVQQLKQQTQNRMKLM 

ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA 


3616 


A 



244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKPDL 

PTWKRNFRSALNRKEGLRLAEDRSKDPHDPHKI 

YEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDIL 

DELLGNMVLAPLPDPGPPSL A VAPEPCPQPLRJSP S 

LDNPTPFPNLGPSENPLKRLLVPGEEWEFEVTAF 

YRGRQVFQQTISCPEGLRLVG SEVGDRTLPGWP 

VTLPDPGMSLTORGVMSYVRHVLSCLGGGLAL 

WRAGQWLWAQRLGHCHTYWAVSEELLFNSGH 

GPDGEVPKDKEGGVFDLGPFIVGSLGPPDLITFTE 

GSGRSPRYALWFCVGESWPQDQPWTKRLVMVK 

VVPTCLRALVEMARVGGASSLENTVDLHISNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 


3617 


A 


852 


304 


RGGLLSKMAXVLKAAAANAVGLFSRLQAP1PTV 

RASSTSQPLDQVTGSVWNLGRLNHVAIAVPDLE 

KAAAFYKNBLGAQVSEAVPLPEHGVSWFVNLG 

NTKMELLHPLGRDSPIAGFLQKNKAGGMHHICIE 

VDhHNAAVMDLKKKKIRSLSEEVKIGAHGKPVIF 

LHPKDCGGVLVELEQA 


3618 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVRFTFAS 
DDDMEGDEAWRCTLSANMYVDEDLVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMS SKKKDLHPRDIDAFWLQRQL 

SRFYDDArVSQKKADEVLEILKTASDDRECENQL 

VLIXGFmTDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTOLETMDLDQGGEALAPRQ 

VLDLEDL VFTQGSHFMANKRCQLPDG SFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLKRIQSKLYRAALETDENLLLCAPTGA 

GKT^ALMCMLREIGKHINMDGTINVDDFKIIYl 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQUVCTPEKWDIITRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPVPLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVTIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPDBSQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE 

LFRVFSLSSEFKNITVREEEKJLELQKLLERVPIPVK 

ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMRAIFEI VLNRG WAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 

FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 

EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 

HLILPEKYPPPTELLDLQPLPVSALKNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

NniSTPEKWDILSRRWKQRKNVQNINLFVVDEV 

HLIGGENGPVLE VICSRMRYISSQIERPIRJVALSS S 

LSNAKDVAHWLGCSATSTFNFHPNV^VPLELHI 

QGFNISHTQTRLLSMAKPVFHAITKHSPKKPVIVF 

VPSRKQTRLTAJDILTTCAADIQRQRFLHCTEKDL 

EPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQWVASRSLCWGMNVAAHLVIIM 

DTLYYNGKIHAYVDYPIYDVLQNrVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVX)YLTWTFLYR 

RMTQNPNYYNLQG1SHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGM1AAYYYINYTTIEL 

FSMSLNAKTKVRGLIEnSNAAEYENlPIRHHEDN 

LLRQIJVQKWHKLNNPKFNDPHVKTNGLLLQAHL 

SRMQLSAELQSDTEEELSKAIRLIQACVDVLSSNG 

WI^PAl-AAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNEELSYEVVDKDSIRSGGP

VWLVQLEREEEVTGPVIAPLFPQKREEGWWW 

IGDAKSNSLISDCRJLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3619 


A 


3 


5992 


DNIDETYG VNVQFESDEEEGDEDV YGEVREEA S 

DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVTEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDATVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMELYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTOVALMCMIJIEIGKHINMDGTINVDDFKIIYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQnVCTPEKWDIITRKGGERTYTQLV 

RLIILDEWLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPWLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYTTNDTVQTYNQLLKPTLSEIE 

LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 

ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY

VTQSAGRLMRAIFEIVLNRGWAQLTDKTXNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 

FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 

LELS VHLQPITRSTLKVELTITPDFQ WDEKVHG SS 

EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRWSDRWLSCETQLPVSFR 

HLBUPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKPPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

MnSTPEKWDILSRRWKQRK>rVQNINLFVVDEV 

HLIGGENGPVLEVICSRMRYISSQDSRPIRJVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFhnSHTQTRLLSMAKPVFHAITKHSPKKPVTVF 

WSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQVWASRSLCWGMNVAAHLVIIM 

DTLYYNGK1HAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTLENKQDAVDYLTWTFLYR 

RMTQNPhTyYNlXJGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMD V APLNLGMLA A Y YYINY 'IT1EL 

FSMSLNAKTKVRGLIEnSNAAEYENTPIRHHEDN 

LLRQI^QKVPHKLNNPKFNDPHVKThTLLLQAHL 

SRMQLSAELQSDTEETT ,SKAIRLIQACVD VLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWVV 

IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3620 


A 


1205 



323 


VIKMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 

VRLPSATLCYFCRCRLGLGAALFPRSARAJLAASA 

LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

EKPQQHQKTKJVIIVLGFS^INWVRTRIKAFLIWA 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVLHALKEKVTSLPDNHKNALAANIDBI 

VFTSTGDISIYYDEKGRKJFVNILMCFVVTLTSANIP 

SETLRGASVFQVKLGNQKVETKQLLSASYEFQR 

EFTQGVKPDWTIARIEHSKLLE 


3621 


A 

1 


2 



2995 



SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRJLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 

THLNTE VXNS SDTGKVKLDEN SEKHL VKJDLKA Q 

GTRDSKPlALKJEErvnTPKETCTSEKETPPPLPTlASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLPPELPG 

GDLSPPDSPEPKAITPPQQPYKKRPKICCPRYGER 

RQTESDWGKRCVDKFDIIGIIGEGTYGQVYKAKD 

KDTGELVALKKVRLDNEKEGFPITAIREIKILRQL 

IHRSVVNMKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHDCSFMKQLMEGL 

EYCHKJKNFLHRDKCSNILLNNSGQIKLADFGLA 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AIDVWSCGCILGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVIKLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 

LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 

PAGSLEENNSDKNSGPQGPRRTPTMPQEEAAGRS 

NGGNAL 


3622 


A 


16 


390 


TPERGSAYPETAAVRRPAGECPITMSDLEAKLST 
EHLGDKIKDEDIKLRVIGQDSSEIHFKVKMTTPLK 
KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQ1GGHSTV 


3623


A


2


1544 


PPPAPGPDGLNEGCLHRLSMPHQRPRTCAMNPE 

LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASDLDGGDYRPE 

LSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSFTLMRDERGLPAMNNLYSPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

MLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

QVATSGQLEEINTKEVAQRITAELKRYSIPQAIFA 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWKWLQEPEFQRMSALRLAACKRJKEQEPNKDR 

NNSQKKSRLWTDLQRRTLFAIFKENKRPSKEMQ 

IHSQQLGLELTTVSNFFMNARRRSLEKWQDDLS 

TGGSSSTSSTCTKA 


3624 


A 


27 


2152 


S ARKAEAATSGTAARDG S VGRNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKTT 

AKGDINQKLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEKJDECNKJmKTVLTELQAKIARLTKRF 

EAAKEDLKKRHEHPPNPPVSPGKTVNDVNSNNN 

MSYRNAGTVRQMLESKRNVSESAPPSFQTPVNT 

VSSTNLVTPPAVVSSQPKLQTPVTSGSLTATSVLP 

APNTATVVATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTNQPSGNVEFISVQSPPT 

VSGLTK>nPVSLPSLPNPTKPNNVPSVPSPSIORNP 

TASAAPLGTTLAVQAVPTAHSIVQATRTSLPTVG 

PSGLYSPSTNRGPIQMKIPISAFSTSSAAEQNSNTT 

PRIENQTNKTIDASVSKKAADSTSQCGKATGSDS 

SGVIDLTMDDEESG ASQDPKKLNHTPV STMSSSQ 

PVSRPLQPIQPAPPLQPSGVPTSGPSQTnHLLPTA 

PT I W VTHRP VTQVTTRLP VPRAPANHQ V VYTT 

LPAPPAQAPLRGTVMQAPAVRQVNPQNSVTVRV 

PQTTTYVVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PRPVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV 

SHAFRVKMAI VL VMECPGGG SKJLCHC 


3625 


A 


210 



1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LQGSRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KJRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAIHTDIMDDWLDCAFTCG 

VHCHGQGKYPCLQVFVNLSHPGQKALLHYNEE 

AVQINPKCFYTPKCHQDRNDLLNSALDIKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALIVGMVRLTQHLSLLCEKYSTVV 

RDEVGGKVPYIEQHQFKLCIMRRSKGRAEKS 


3626 


A 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEG 

FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 

EEEQLRAQGSTD YFLS SGDKIRFFFEKGVFDEKG 

NFLVPPEKSINKIGHALHAHDPVFKSITHSFKVQT 

LARSLGLQMPWVQSMYIFKQPHFGGEVSPHQD 

ASFLYTEPLGRVLGVWIAVEDATLENGCLWFEPG 

SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 

FVPTPVQRGALVLIHGEVVHKSKQNLSDRSRQA 

YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


3627 


A 


231 


644 


mSSPRTGRDHQELNLHTERDSRSQRAVLKIPRQ 

3628 



3629 




699 



810 



1604 




NPGJCFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERIGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 



GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEHTKA 

RITDFQFKELVVLPREIDLNEWLASNTTTFFHHIN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 



CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RLSATLEENDLLQGTVEELQDRVL1LERQGHDKD 

LQLHQSQLELQBVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEDBQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSQQLEAWQDDMHRVIDRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 



3630 



423 



PAKVLTLDIYLSKTEGAQVDEPVVITPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 



3631 



2082 



674 



3632 



942 



WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEKAGSKEQ 

GPRQRKGAAPAEKKCGAETQHEGLELRVENLQA 

VQTDFSSDPLQKWCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEIEDLALGPDGKLVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEVVSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLIIVTILLLQSAFPG 

FL 



40 



PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 
RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 
TCCRTTCWKPTTVTTCSSTPCCQPSCCVPSCCQP 
CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 
CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 
HPTCY QTICFRTTCCQPTCCQPTCCRNTSCQPTCC 
GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 
CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 
CYRTTCCRPSCCCSPCCVSSCCQPSCC 



3633 



605 



3004 



1 



GPEG YRGRRARHPSLG S TTGHCGGGRG AEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYE 

3634 



159 



384 



3635 



409 



3636 



3637 



3638 



48 



282 



1248 



KQEGEKLPFLGLALSSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTA SYVRAQYQHDTSLPHCRFYEEFD 

AHGREWLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANVVVYSYHYLLDPKLADLVSKELARK 

AVWTOEAHNIDNVCIDSMSVNLTRRTLDRCQG 

NLETLQKTNHJR1KETDEQRLRDEYRRLVEGLREA 

SAARETDAHLAKPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHVVQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSIJLJrrLEITDLADFSPL 

TLLANFATLVSTYAKGFTIIIEPFDDRTPTIANPIL 

HFSCMDASLAIKPVFERFQSVIITSGTLSPLDIYPK 

ILDFHPVTMATFTMTLARVCLCPMIIGRGNDQVA 

ISSKFETREDIAVTKNYGKLLLEMSAVVPDGIVAF 

FTSYQYMESTVASWYEQGILENIQRNKLLFIETQ 

DGAETSVALEKYQEACENGRGAELLSVARGKVS 

EGIDFVHHYGRAVIMFGWYVYTQSRILKARLEY 

LRIXJFQIRENDFLTFX)AMRHAAQCVGRAIRGKT 

DYGLMVFADKRFARGDKRGKLPRWIQEHLTDA 

NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 

SLLSLEQLESEETLKRJEQIAQQL 



LKMSSKTASTNN1AQARRTVQQLRLEASIERIKV 
SKASADLMSYCEEHARSDPLLIGIPTSENPFKDKK 
TC1IL



TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 

IJETQFTCPFC3>THEKSCDVKMDRARNTGVISCTV 

CLEEFQTPITCILGNLGFFQRVGRGLESGPCSSGP 

LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 

DHLK5CYQDSHEDPTKMKRFLFLLLTISLLVMVQ 

IQTGLSGQNDTSQTSSPS ASSSMSG GIFLFFVANAI 

IHLFCFS 



3639 



11 



630 



1200 



ARAGSVVGSAAARGPPAGCRCERAARLPSSPAR 

RRRCDWVEDGAGRMEILMTVSKFASICTMGAN 

ASAIJEKEIGPEQFPVNEHYFGLVNFGNTCYCNSV 

LQALYFCRPFREKGLAYKSQPRKKESLLTCLADL 

FHSIATQKJCKVGVIPPKKFITRLRKENELFDNYM 

QQDAHEFLNYLLNTIADILQEERKQEKQNGRLPN 

GNIDNEh^STPDPTWVHEIFQGTLTNETRCLTC 

ETISSKDEDFLDLSVDVEQNTSITHCLRGFSNTET 

LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 

HLKRFKYMDQLHRYTKLSYRVVFPLELRLFNTS 

GDATNPDRMYDL VA VWHCGS GPNRGHYIATV 

KSHDFWLLFDDDIVEKIDAQAIEEFYGLTSDISKN 

SES GYILFYQSRD 



PAGIPVST1SSDRRASTDLTOKMKPDETPMFDPNL 

LKEVDWSQNTATFSPAISPTHPGEGLVLRPLCTA 

DLNRGFFKVLGQLTETGVVSPEQFMKSFEHMKK 

SGDYYVTVVEDVTLGQIVATATLIIEHKFIHSCAK 

RGRVEDVVVSDECRGKQLGNLLLSTLTLLSKKL 

NCYKTIXECLPQNVGFYKKFGYTVSEENYMCRR 

FLK 



PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHSSPL 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCLLSPVVLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFFGLSSTLQLLAMALE 



403 

CWl^LGHPFFYRRHITLRLGALVAPVVSAFSLAF 

CALPFMGFGKFVQYCPGTWCFIQMVHEEGSLSV 

LGYSVLYSSLMALLVI^TVLCNLGAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD 

HLLLLALMTVLFTMCSLPVIYRAYYGAFKDVKE 

KNRTSEEAEDOtALRFLSVISrVDPWTFnFRSPVFR 

IFFHKIFIRPLRYRSRCSNSTOMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 

AIEAHCLGSTAIGIQTSEGVCLAVEKRTTSPLMEPS 

SffiKIVEIDAHIGCAMSGLIADAKTLIDKARVETQ 

NHWFTYNETN1TVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALIJFGGVDEKGPQLFrlMDPSGTFV 

QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 

SLIILKQVMEEKLNATNIELATVQPGQNFHMFTK 

EELEEVIKDI 


3641 


A 


2 


1254 


PTGQGGKRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

S WPSCLCHGLISFLGFLLLLV1 FPISG WFALKI VPT 

YERMIVFRLGRIRTPQGPGMVLLLPFIDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRTWDP 

VLSVMTVKDLNTATRMTAQNAMTKALLKRPLR 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 

SPKQPLAEGLLTAJLQPFLSEALVSQVGACY QFN V 

VLPSGTQSAYFLDLTTGRGRVGHGVPDGIPDVV 

VEMAEADLRALLCRELRPLGAYMSGRLKVKGD 

LAMAMKLEAVLRALK 


3642 


A 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEKPRKH 

DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 

RSREQKAKQER 


3643 


A 


94 


541 


RKERRRRRRRMEAWFVFSLLDCCALIFLSVYFU 

TLSDLECDYTNARSCCSKLNKWVTPELIGHTTVTV 

LLLMSLHWFIFIXNLPVATWNIYRYIMVPSGNM 

GVFDPTEIHNRGQLKSHMKEAM1KLGFHLLCFF 

MYLYSMILAUND 


3644 


A 


95 


2808 



TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL 

FCSRDPVPEYLNHCGVKYVLISDRASFCALHIFFS 

PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEME1PK 

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 

VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS 

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 

PTALGPRCLSAVP'IPAPISAPGPAAAFAGTVTIHN 

QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG 

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 

APAPEEEAEGPAAALGPRGPLGSGPGWLYLCPE 

ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 

GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

GKSFTTVYNLKAHMKGHEQENSFKCEVCEESFP 

TQAKLGAHQRSHFEPERP YQCAFSGCKK 1 Fll VS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHUlSHTGERPFLCDroGCGWNFrSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD 

TWKSRCPISSCNiaFTSKHSMKTHMVKRHKVGQ 

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSDFFGQEGETQFGFPNAA 

GNHGSQKERNLITVTGSSFLV 


3645 


A 


2194 


1707 


TVSFHKTMASLKCSTVVCVICLEKPKYRCPACRV 

PYCSWCFRKHKEQCNPETRPVEKKIRSALPTKT 

VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 

LGESATLRSLLLNPHLRQ1>IVNLDQGEDKAKLM 

RAYMQEPLFVEFADCCLGIVEPSQNEES 


3646 


A 


85 


1948 


ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELHHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKIPSESG 

EKLKWNERATLFRITSNAMINACRDFLELAEIHS 

RKWQRALQYEQEQRVHLEETIEQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT 

EYFDAMEDSTSFITVITEAKEDSRKAEGSTGTSSA 

DWSSADma,DGASLVPKGSSKVKRRVRIPNKPN 

YSLNLWSIMKNCIGRELSRIPMPVNFNEPLSMLO 

RLTEDLEYHHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHR1AKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEITISSKF 

RGKYISIMPLGAIHLEFQASGNHYVWRKSTSTVH 

NnVGKLWIDQSGDmrVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGWSDSQOKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

RLRPDQRLMEKGRWDEANTEKQRLEEKQRLSR 

RRRLEACGPGSSCSSEE 


3647 


A 


46 


5007 


PTGDACVSTSCELASALSHLDASHLTENLPKAAS 

ELGQQPMTELDSSSDLISSPGKKGAAHPDPSKTS 

VDTG QVSRPENPSQPASPRVTKCKARSPVRLPHE 

GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 

ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 

PGNHSKALEMTGMAPESSQEPSLLEGADSVSSR 

APQASLSMLPSTONTKEACGHVSGHCCPGGSRE 

SP VTDIDSFIKELDAS AARSPS SQTGDSGSQEGS A 

QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 

GAPAYPQWASQPSVLDSIOTDKHFTVNKNFLSN 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 

DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

PSESEEEQEEICSTRGCPNPPSSPAHLPTQAAICPAS 

AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 

SSQPSSLLEMSSQEHETHADISTSQNHRPSCAEET 

TEVTSASSAMENSPLSKVARHFHSPPIILS SPNMV 

NGLEHDLLDDETLNQYETSINAAASLSSFSVDVP 

KNGESVLEN1JUSESQDLDDLLQKPKMIARRPIM 

AWFKEINKHNQGTHLRSKTEKEQPLMPARSPDS 

KIQMVSSSQKKGVTVPHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTOHRKPLISPQTSHKTLSKAVS 

QRLHVADHEDPDRNTTAAPRSPQ C VLESKPPL AT 

SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF 

HMA VLS EPDRGCPTTPKS PKCRAEGRAPRAD SG 

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLKIVEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRJLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAERKAEYSQGKS SLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSWPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDIISVQETSCL 

VTDKIKVTRRHYCYEQNWPHESTSFFSVKQRIKS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPTPTMTLASP\OCRl^SSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

E6VCFIVLNRKEGSGLGFSVAGGTDVEPKSITVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALWIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA 

WmrfKSVPEGPVQLLIRKPOWSS 


3648 


A


337 


1564 


KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 

VLASMTKAADPRFRPRWKV VL IFF VGAAILWLL 

CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 

PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 

TYLKKG YLTFSDS GDKV A VE WDKDHG VLE SHL 

AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE 

GSKAWWVILSIXjIXjTVEKGFKAEWLAVKDER 

LYVGGLGKEWTTTTGDVVNENPEWVKWGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL 

I^ASPDFGDIAVSrWGAVVPTHGFSSFKFIPNTDD 

QnVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGDEFI 


3649 


A 


1 


775 


PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA 

QFKSIQHHLRTAQEHDKRDPWAYYCRLYAMQ 

TGMKIDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNBDRAGRF 

HKNMIKSFyTASlXroVITVFGELTDENVKHRKY 

ARWKATYIHNCLKNGETPQAGPVGIEEDNDIEEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 

QIPPGAHAPANTPAEVPHSTGVAK 


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKIHGQGKKNLHGDGLAI 

WYTKDRMQPGPVFGNMDKFVGLGVFVDTYPNE 

EKQQERVFPY1SAMVNNGSLSYDHERDGRPTEL 

GGCTAIVRh^HYDTFLVIRYVKRHLTIMMDIDGK 

HEWRDCIEVPGVRLPRGYYFGTSSITGDLSDNHD 

VISLKLFELTVERTPEEEKIJiRDVFIJPSVDNMKL 

PEMTAPLPPLSGLALFLIVFFSLVFSVFAIV1GIBLY 
NKWQEQSRKRFY 


3651 


A 


1 


1218 


RS WAYVKKCKNNMCPNRGLHDG PEPC WLHHA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIILILIS FTCRFLLNSR VTD AAFNFLL VW 

YYCTLT1RESILINNGSRIKGWWVFHHYVSTFLSG 

VMLTWPDGLMYQKFRNQFLSFSMYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW 

RVLTFLLPFLFFGHFWQLFNALTLFNLAQDPQCK 

E WQVLMCGFPFI J ,T ,FLGNFFTTLRWHHKFHSQ 

RHGSKKD 


3652 


A 


640 



164 


VTTSCnPFAFGLGVRASERLAEIDMPYLLKYQPM 

MQTIGQKYCMDPAV1AGVLSRKSPGDKILVNMG 

DRTSMVQDPGSQAPTSWISESQVFQTTEVLTTRI 

TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 

QDLSCDFCNDVLARAKYLKRHGF 


3653 


A 


2 


909 


IVRRDWQEVSDEKLAMANCKMTKSIRFPALEHC 

YTGGEWLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCnVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGDRJKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEG YICFLG RSDDIINASG YR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEWK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKJERKELRKKETGQM 


3654 


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEWLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKA TPPYD VQFHMEAS V 

ENCnVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGN1GIRIKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAWGSPDPIRGEWK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3655 


A 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL 

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQMSWQRPKDnNNPUTMDGISPTDICQGILGDC 

WLLAAIGSLTTCPKLLYRVVPRGQSFKKNYAGIF 

HFQIWQFGQWVNVVVDDRLPTKNDKLVFVHST 

ERSEFWSALLEKAYAKLSGSYEALSGGSTMEGL 

EDFTG G V AQSFQLQRPPQNLLRJLLRKA VERS SL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVRNPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN 

FTLIXICNLTPDTLSGDYKSYWHTTFYEGSWRTG 

SSAGGCRNHPGTFWTOPQFKISLPEGDDPEDDAE 

GNVVVCTCLVALMQKNWRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEIF 

TNSREVSSQLRLPPGEYIOPSTFEPHRDADFLLRV 

FTEKHSESWELDEVNYAEQLQEEKVSEDDMDQ 

DFLHLFKJVAGEGKEIGVYELQRLLNRMAIKFKS 

FKTKGFGLDACRCMINLMDKDGSGKLGLLEFKI 

LWKKLKKWMDIFIU3CDQDHSGTLNSYEMRLVEE 

KAGIKLNNKVMQVLVARYADDDLUDFDSFISCF 

LRLKTMFTTFLTMDPKNTGHICLSLEQVLGEGW 

EGICRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 


3656 


A 


3 


174 


PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYIKSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GBPMKSRSPALL 


3658 


A 


92 


1537 


SEAPVQPQPYTMTSFYSTSSCPLGCTMAPGARNV 

FVSPIDVGCQPVAEANAASMCLLANVAHANRVR

VGSTPLGRPSLCLPPTSHTACPLPGTCHIPGN1GIC 

GAYGK>m>NGHEKETMKFLNDRLANYLEKVRQ 

LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKILCSKAENARLIVQIDNAICLAADDFRIKL 

ESERSLHQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVKILRSQLGEKFRIEL 

DIEPTIDLNRVLGEMRAQYEAMVETNHQDVEQ 

WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 

ARLENEIATYRNLTPLQSLFHACLLYFLSKLWPC 

HRWVSLWPWSQHGEMDLKARVRRLRLVALGSG 

VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 
KCPPKCTPKCPPKCPPKCPPQYSAPCPPPVSSCCG 
SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGNTAKNGGIJLLSTNMKWVQFSNLHVDVPKD 

LTKPVVTISDEPDILYKRLSVLVKGHDKAVLDSY 

EYFAVLAAKELG1SBKVHEPPRKIERFTLLQSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRNLPEGVAMEVTKFCFFIFJLDTIRTVTRTHQGA 

NLGNTIRRKRRKQVIKPQGGHFCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYRNPTKMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RKSLPHPNPQKMLKKPLSAVTWLCIFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

VANLSSLLSELKKKQERDWVSWMQVMELESN 

SKRMESRLTDAESKYSEMNNQDDIMQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEVFCDMETSGGGWIHQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHIHRLSRQPTRLRVE

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY 

TGKVGhTOALQYHNOTAFSTKDKDNDNCLDKCA 

Q1JUCGGYWYNCCTDSNLNGVYYRLGEHNKHLD 

GITWYGWHG STYSLKRVEMKIRPEDFKP


3663 


A 


64 


1456 


LSSAKJETI^QMYNTVWNMEDLDLEYAKTDINC 

GTDLMFYIEMDPPALPPKPPKPTTVANNGMNNN 

MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 

RDASTKMHGDYTLTLRKGGNNK1JKIFHRDGKY 

GFSDPLTFSSVVELINHYRNESLAQYNPKLDVKL 

LYPVSKYQQDQWKEDMEAVGKKIJHEYNTQFQ 

EKSREYDIU-YEEYTRTSQEIQMKRTAIEAFNETIK 

IFEEQCQTQERYSKEYIEKFKREGNEKEIQRIMHN 

YDKLKSRISEIIDSRRRLEEDL.KKQAAEYREIDKR 

MNSIKPDLIOLRKTRIX>YLMWLTOKGVROKKL 

NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 

GSSNRNKAEhHLXRGKRIXJTFLVRESSKQGCYAC 

SWVDGEVKHCVINKTATGYGFAEPYNLYSSLK 

ELVLHYQHTSLVQHNDSLNVTLAYPVYAQQRR


3664 


A 


944 


406 


GATVEDQSCNFGSLRWVVSVPHISARSCPDPLLS 

RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 

QVDVPTLTGAFGILAAHVPTLQVLRPGLVWHA 

EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 

MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 

EEANEALVKALE 


3665 


A 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH

RJHQSGRCLSTGMAPNLKGRPRKKKPCPQRRDSF 

SGVKDSNNNSDGKAVAKVKCEARSALTKPKNN 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETITARRQWKHIYDELGGNPGSTSAATCTRR

HYERLILPYERFEKGEEDKPLPPIKPRKQENSSQE

NEhfKTKVSGTKRIKHEIPKSKKEKENAPKPQDAA 

EVSSEQEKEQETLISQKSIPEPLPAADMKICKIEGY 

QEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 

ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 

HRWQTRMRRRMTNCPPWQITLPTAP 


3666 


A 


113 


1492 


LLQEMCTTCTIPVLWGCFLLWNLYVSSSQTIYPGI 

KARITQRA1J5YGVQAGMKMDBQMLKEKKLPDL 

SGSESLEFLKVDYVNYNFSNIKISAFSFPNTSLAF 

VPGVGDCALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPttJCNLNEMLCPIIASEVKALNANLSTLE 

VLTKIDNYTLLDYSLISSPEITENYLDLNLKGVFY 

PLENLTDPPFSPVPFVLPERSNSMLYIGIAEYFFKS 

ASFAHFTAGVFNVTLSTEEISNHFVQNSQGLGNV 

LSR1AEIYILSQPFMVRIMATEPPIINLQPGNFTLDI 

PASIMMLTQPKNSTVETIVSMDFVASTSVGLVIL 

GQRLVCSLSLNRFRLALPESNRSNIEVLRFENILSS 

ILHFGVLPLANAKLQQGFPLPNPHKFLFVNSDIEV 

LBGFLUSTDLKYETSSKQQPSFHVWEGLNLISRQ 

WRGKSAP 


3667 


A 


1 


181 


FRGRJLGSGKNGGGSMNAPPAFESFLLFEGEKITIN 
KDTKVPNACLFTINKEDHTLGNIIK 


3668 


A 


212 


431 


VAGEAVPFFPMMYSEPLKPSYLALVLWYFLLTG 
YCnXPEVTFKIEQGEEPWILEKGFPSQCHPAXYL 
WCLHD 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRJLQ 
MYNSQHRSAISCIRTVWRTEGLGAFYRSYTTQLT 
MNIPFQSIHFITYEFLQEQVNPHRTYNPQSHIISGG
LAGALAAAA1TPLDVCKTLLNTQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A 


145 


298 


W^CPLTrTLPSTLMVLLLSLI^'FSALTFHSICQLKN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ILKVAKKERTMSSLPVPYKLPVSLSVGSCVIIKGT 

PIHSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVMNRREFGIWMLEETTDYVPFBDGKQFELC 

IYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTBCGM 

VILRDKIRFYEGQKLLDSLAETWDFFFSDVLPML 

QAIFYPVQGKEPSVRQLALLHFRNATTLSVKLED 

AXARAHARVPPAIVQMLLVLQGVHESRGVTEDY 

LRLETLVQKVVSPYLGTYGLHSSEGPFTHSCILEK

R1XRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 


3673 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLDCHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVOCANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KTVERFPDLAAEKECRDREER>JEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3674 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSI(^CKMNhfVNVVYTPWSI^KKTAI> 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKXREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3675 


A 


921 


1321 


VTLAKMRVHISSCLKVQEQMANCPKFVPVVPTS 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRWCPICSAMPWGDPSYKSANFLQHL
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRIJtRRRRRRR 

RRMISRYTRKAWQSLEIJCGITKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQBECQQWT 

ASFPHLRILGRQIITPSEGYRLYPRSPSAVSASYET 

TLSQERDSUFGIRGKKLHFSSSYAHKASSIAKSSS

FCSMERDEEDSIIVSEGIIEEYLAFDHIDIEEGFHG 

KJKSEAATEKQKLGYPPIAPFYCMKJEDVLAYVFD 

SVWCKWSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYTTSNPMS 

LCQASRHQPNVNDLLVHGMPLQPRNLSLMDKLL 

D1JDDKLLMRPGSSTELSTRNWPNRAVEFSTSSLS 

YTVQSTRRRNPPPR'n-HPISTSHSCAETPRSVEEIL

RGARVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVE 

HVSTVGPQRQMKPHGDSSRAQSAVVDEPNYQQ 

PQERLLLPDFFPRPNTl^SFLLDTQYRRSCAVEYP 

HQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 

P 


3677 


A 


246 


757 


MRLQGAIFVLLPHLGPILVWLFTRDHMSGWCEG 

PRMLSWCPFYKVLLLVQTAIYSVVGYASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

VHNPGLALLHLLLLYGLWSTALIWHPINKLAAL 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 

EKSD 


3678


A 


20 


1508 


RGKAEFFLAMAGTNALLMLENFIDGKFLPCSSYI 

DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 

SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 

DQGKTLALARTMDIPRSVQNFRFFASSSLHHTSE 

CTQMDHLGCMHYTVRAPVGVAGLISPWNLPLY 

LLTWKIAPAMAAGNTVIAKPSELTSVTAWMLCK 

LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 

ISFTGSQPTAERITQLSAPHCKKLSLELGGKNPAH 

FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK

SIYSEFLKRFVEATRKWKVGIPSDPLVSIGAL1SK 

AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 

RNQAGYFMLPTVITDDCDESCCMTEEIFGPVTCV 

VPFDSEEEVIERANNVKYGLAATVWSSNVGRVH 

RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 

GREGAKDSYDFFTEIKTITVKH 


3679


A 


1862 


502 


MAGTKPYMEIQTTIREYYEHLYANKLENLEEMD 

KFLDTYTLPRLNQEEVESLNRPITGSEIEAIINSLP

TKKIPGPDRFTAKJ^QRYKEELSNLIHYLGI^HH 

LLALNFIIVSFGKKSAWSSAQVKVTDTDFDGVEV 

RV7EGPPKPEEPLKRSWYIHGGGWALASAKIRY

YDELCTAMAEELNAV1VSIEYRLVPKVYFPEQIH 

DVVRATKYFLKPEVLQKYMVDPGRICISGDSAG 

GNUVAALGQQFTQDASLKNKLKLQALIYPVLQA 

LDFmPSYQQNVOTPILPRYVMVKYWVDYFKG 

NYDFVQAMIVhnfflTSLDVEEAAAVRARLhrWTS 

LLPASFTKNYKPWQTTGNARIVQELPQLLDARS 

APLIADQAVLQLLPKTYELTCEHDVLRDDGIMYA 

KRLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 

GIRTRNSYIKWLDQNL 


3680 


A 


249 


2146 


RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFV 

IJ^FLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKLQERAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVDQKFRRCPPLATTSVIIVFHNEAWS 

TLLRTVYSVLHTTPAILLKEIILVDDASTEEHLKE 

KLEQYVKQLQVVRWRQEERKGLITARLLGASV 

AQAEVLTFLDAHCECFHGWLEPLLARIAEDKTV 

VVSPDIVTIDLNTFEFAKPVQRGRVHSRGNFDWS 

LTFGWETLPPHEKQRRKDETYPDCSPTFAGGLFSI 

SKSYFEfflGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEIIPCSVVGHVFRTKJSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKJFYRKNLQAAKMAQEKSFG 

DISERLQLREQLHCHOTSWYLHKVryTEMFVPDL 



WO 01/57190 PCT/US01/04098



TPTFYGAIKNLGTNQCLDVGENNRGGKPLIMYS 
CHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKG 
ALGLGSCHFTGKNSQVPKDEEWELAQDQLIRNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSOLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQV1TTLRT

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSYLASKLK

ESVKEKEKVHSEWQIRSEVSQVKREKENIQTLL 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 

LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 

SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 

VISVYRMHLLYAVQGQMDEDVQKVLKQILTMC 

KNQSQKK 


3682 


A 


447 


1024 


AQALTAGRQLALAAPFIAPISPISLPRLNPPSQSW 

NSTPFFKVKLPPQKEVITSDELMAHLGNCLLSIKP 

QEKSEGLQLNFQQNVDDAMTVLPKLATGLDVN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSYNQL/VGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV


3683 


A 


2 


942 


IJEIKQEEKFVGQCIKEELMHGECVKEEKDFLKKE 

IVDDTKVKEEPPINHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAFLKJlPISNKYMYFMK>mARJ^QGINLKJLLPNG 

FTKRKENSTFFDKKKQQFCWHVKLQFPQSQA\ST 

*KKRVPDDKTINEILKPYIDPEKSDPVIRQRLKAYI 

RSQTGVQILMKIEYMQQNLVRYYELDPYKSLLD 

NLRNKVnEYPTLrTV^LKGShmDMKVIJHlQVKSE 

STKNVGNEN 


3684 


A 


119 


1533 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVF1EHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAW 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

*GGDLTPVPDGPHDCPRDVQGIPGAGGGSQLAPC

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPLHPTT/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

YSLP*LCRECCSCP/PPPPAHGGRCPSLLPPEALAK 

LLL 


3685 


A 


101 


438 


AWVLQCKJNTELQTEVVMLKSMVLVVLGEQVQS 
LQLQQQLHCHFhmTfflC\nTNLEYN\KEYPWE>LV 
KAHLQGASTSNITFDIGELQKKULDLNKQTQEFQ 
PSL*AWTEFQQGLE 


3686 


A 


105 


845 


VSDWKNQLVEVQCRQDGCDAVENVHQMFMF 
hWFTlXXWTlJI^NYQPSVESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNTFSSEI 
EDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE
EEEEEGEDDEDADNDDNSGCSGENKEENIKDSS 
GQEDETQSSNDDPSQSVASQDAQEURPRRCKYF 
DTNSEVEEESEEDEDYIP/SnSFFQSSDGI*SSSSSE 
DWKKEIMVGS 


3687 


A 


49 


1225 


PVLVTSLRMREADTLRPPQLMEVSADnSTVEFN 

HTGELLATGDKGGRWIFQREPESKNAPHSQGE 

YDVYSTFQSHEPEFDYLKSLEIEEKINKIKWLPQQ 

NAAHSLLSTNDKTDCLWKITERDKRPEGYNLKDE 

EGKLKIJLSTVTSLQVPVLKPMDLMVEVSPRRIFA 

NGHTYHmSISVNSDCETYMSADDLRINLWHLAI 

TDRSFTPXNTVDIKPANMEDLTEVrrASEFHPHHC 

NLFVYSSSKGSLRLCDMRAAALCDKHSKLFEEPE 

DPSNRSFFSEnS\SVSDVKFSHSDRYMLTR\DYLT 

VKVWDLNMEARPIETYQVHDYLRSKLCSLYEND 

CIFDKraCAWNGSDR/IIMTGAYNNFFRMFDRNT 

KRDVTLEASRGSSKPRAVL


3688 


A 


1 


401 


KKVPGRLSEMSFSLlSIFTLPANri'SSPV'nDCGPSL 
GLAAGIPLLVATAIXVALLFTLMRRRSSIEAMEE 
SDRPCEISEIDDNPKISENPRRSPTHEKNTMGAQE 
AHTYVKTVAGSEEPVHDRYRPTffiMERRR 


3689 


A 


698 


889 


GRVLVHCAMGVSRSATLVLAFLMIYENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 


3690 


A 


61


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3691 


A 


61


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3692 


A 


3 


2831 


PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR 

DKAEHCIND1AFKPDGTQLILAAGSRLLVYDTSD 

GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD

KSVUWTSKJ.EGILKYTHNDAIQCVSYNPITHQLA 

SCSSSDFGLWSPEQKSVSKHKSSSIOICCSWTNDG 

QYLALGMFNGDSIRNKKGEEKVKIERPGGSLSPI

WSICWNPSSRWESFWMNRENEDAEDVIVNRYIQ 

EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 

EERNDILAVADWG\QKVSFYQLSGKQIGKDRAL

NFDPCCISYFTKGEYILLGGSDKQVSLFTKDGVR 

LGTVGEQNSWVWTGQAKPDSNYWGGCQDGTI 

SFYQLIFSTVHGLYKDRYAYRDSMTDVIVQHLIT 

EQKVRIKCKELVKKIAIYRKRLAIQLPEK1LIYELY 

SEDI^DMHYRVKEKIDCKFECNLLVVCANHIILC 

QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 

GREGLLVGLKNGQILKIFVDNLFAIVLLKQATAV 

RC1X>MSASRKKI^VVDENDTCLVYDIDTKELLF 

QEPNANSVAWNTQCEDMLCFSGGGYLKIKASTF 

PVHRQKLQGFVVGYNGSKIFCLHVFSISAVEVPQ 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 

MEALEGLDFETAKKERKKRGETNNDLFLADVFS 

YQGKFHEAAKLYKRSGHENLALEMYTDLCMFE 

YAKJDFLGSGDPK^TKMLITKQADWARNIKEPKA 

AVEMYISAGEHVKAIEICGDHGWVDMLIDIARK 

LDKAEREPLLLCATYLKKLDSPGYAAETYLKMG 

DLKSLVQLHVETQRWDEAFALGEKHPEFKDDIY 

MPYAQWLAENDRFEEAQKAFHKAGRQREAVQV 

I^QLTNNAVAESRFNDAAYYYWMLSMQCLDIA 

QDPAQKD 


3693


A


3


1099 


SSFPTCMRTWHSNTSVSSLLHRPGHVTPQLTIHG 
GWRHHRDHTAIDEWDFNPSKFLIYTCLLLFSVLL

PUaDGnQWSYWAVFAPIWLWKIXVVAGASVG 

AGVWARNPRYRTEGEACVEFKAMLIAVGIHLLL 

LMFEVLVCDRVERGTHFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEILCSVNILQFIFIALKLDRI 

IHWPWLVWVPLWILMSFLCLVVLYYIVWSLLFL 

RSLDWAEQRRTrTVTMAISWITrVVPLLTFEVLL 

VHRLDGHNTFSYVSIFVPLWLSLLTXMAlUJh'RRK 

GGNHWWFAIRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL


3694 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 
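
The relationship between the two location columns and the sequence column can be checked mechanically. The following Python sketch is illustrative only and does not form part of the original table: the helper names are hypothetical, and it assumes that the two nucleotide locations are inclusive, that each listed residue corresponds to one codon, and that the insertion marker in the legend is a backslash. It uses the row for SEQ ID NO: 3694 (locations 483 and 761).

```python
# Illustrative check of one table row; all helper names are hypothetical.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the twenty one-letter codes in the legend
MARKERS = set("X*/\\")  # unknown, stop codon, possible deletion, possible insertion (assumed "\")


def join_sequence(lines):
    """Join the wrapped lines of a sequence cell, dropping any whitespace."""
    return "".join("".join(line.split()) for line in lines)


def unexpected_symbols(sequence):
    """Return the sorted symbols that are not in the legend (likely scanning errors)."""
    return sorted(set(sequence) - RESIDUES - MARKERS)


def codon_span(begin, end):
    """Return (whole codons, leftover nucleotides) for an inclusive location pair.

    abs() is used because some rows list the end location before the beginning.
    """
    return divmod(abs(end - begin) + 1, 3)


# Row for SEQ ID NO: 3694 as printed in the table.
seq_3694 = join_sequence([
    "PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE",
    "LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX",
    "QHIRAHKLPXETAGLXTSELRXLTP",
])
codons, leftover = codon_span(483, 761)
print(len(seq_3694), codons, leftover, unexpected_symbols(seq_3694))
```

Rows whose sequences contain the deletion or insertion markers need not satisfy the one-codon-per-residue assumption, so a mismatch in such rows is not by itself evidence of a transcription error.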


3695 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696 


A 


456 


733 


LSAALWEEPILSLWSETKELTNRGKMNYPQIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
HXGGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877 


1873 


VWL*TLS*HTCALMTVCRSCLVKYLEENNTCPT 
CRJVIHQSHPLQYIGHDRTMQDfVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDIKGETCSAKQHLDS 
HRNGETKADDSSNKEAAE 


3698 


A 


1 


572 


KQCGIPHEWRDENSSVYAEVSRLLLATGHWKR 

LRRDNPRFNLMLGERNRLPFGRLGHEPGLVQLV 

NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 

SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 

ASYNRKKEDGEGKVWIAKSSAGAKVWVQW*M 

TDLEEEIDIPSPVGLGLESEWPL 


3699 


A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 

HHLQPVQVLQTLLHSATAGTGCRRPARPPPAPPT 

PTPWRSRQSGKQSERAS'LKGRGRYGLGALGGR 

GGRALGGSRWPPPLPGETLFSGCKHRRRRRGSD 

AAPGEEAGT 


3700 


A 


33 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKIXJACYIFHHRCRLLEGVKQALWLTKTKL 

IEGLPEKVLSLVDDPRNHIENQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVDNLIQLCKSQILKHPSL 

ARRICVQNSTFSATWNRESLLLQVRGSGGARLST 

KDPLPT1ASREEIEATKNHVIJETFYPISPIIDLHECN 

IYDVTQvTOTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEQPVVVQSVGTDGRVFHFLVFQLM1TDLDSNE 

GVKNLAWVDSDQLLYQHFWCLPVIKKRVWEP 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86 


465 


WTLCGPEAGMVGYDPKPDGRNNTKFQVAVAGS
VSGLVTRALISPFDVIKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILSIGY 
GAVQFLSFEMLTELVHRGSVYDARE 


3702 


A 


166 


814 


GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEIFQEDTVRSPFLYNKDVNGK 
WLWKGDVALLNCTATVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFIIHTVGPKYKSRYRTAAESSLYSCYRNVLQLA
KEQSMSSVGFCVINSAKRGYPLKDATHIALRTVR 
RFLEIHGETIEKW 


3703 


A 


128 


1255 


SLGPSPKSATIPCCGDTMAPF.F.f)AGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERAKIE 

KAYAQQLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKJLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRKAQKPWLKRL 

KEVEASKKSYHAARKDEKTAQTRESHAKADSA 

VSQEQLRKLQERVERCAKEAEKTKAQYEQTLAE 

LHRYTPRYMEDMEQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKFHELHRDLHQGIEAASDE 

ED1JWWRSTHGPGMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVT1.TSIVPTRDGTAPPPQSPGSP 

GTGQDEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMFCFNSLVYDC 

WLNDCECSVGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKKKLKDSE 


3705 


A 


170 


1318 


LNWANLV1MWPREEEKEKVQDYSLGGLSPDLRI 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCPQCRKSFTRRSFRPNLQLANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC

WCRESRSHKQHSVLPLEEWQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 

AQGFRSGRHYWEVCMGP 


3706 


A 


204 


1996 


SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLDTFP 

QSHBPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE

EU>TPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SEGPGNPGLTKSNRMLATEKPLVSSYLALPFQSR 

LAQSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGKPRARGIPRPRGRLQRAM 1 "1 VNLTAMDTRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGTEFP 

ALDIKLGTARDLSSVGTVKSGKTVNLATAGTIKP 

GTAMNLTTVGTTKPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTEGIAMDSATSDPVKPDTITATV 

GTSRLETAMALARVNRAKLGTAKNSLALDTSR 

MGTAVGSVVPVTPDPATGKTTLGSVNNLTTSDV 

ATC1XMPSRSTOLALDNTNAAMDRATBPASLDL 

ATEYKGKCRNLVGDGLGCREGEVCELGDGSMK 

PMSINSNLLGYIGIDTIIEQMRKKTMKTGFDFNIM 

VVGTEGCGAAAGLVAGSTKDPISFPQ 


3707 


A 


3 


549 


SSSISRDFLGQAACASGTMLRWLRDFVLPTAACQ 

DAEQPMRYETLFQALDRNGDGWDIGELQEGLR 

NLGIPLGQDAEEKIFTTGDVNKDGKLDFEEFMKY 

LKDHEKKMKLAFKSLDK^INDGKIEASEIVQSLQ 

TLGLTISEQQAELILQSIDVDGTMTVDWNEWRD 

YFLFNPVTDIEEIIR 


3708 


A 


1 


1866 


EFRGAGRANMLAPRGAAVLLLHLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND

LYVISTFKLQTKSSAT1FGLYSSTDNSKYFEFTVM 

GRLSKAILRYLKNDGKVHLVVFNNLQLADGRRH 

RILLRLSNLQRGAGSLELYLDCIQVDSVHNLPRA 

FAGPSQKPETIELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTVVPPASPAPPTOPPRRCDSKPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHC1NLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

LGSYRCGPCKPGYTGDQIRGCKAERNCRNPELN

PCSVNAQCBEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGILNEQDNCVLIHNV 

DQRNSDKDIFGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNILDKCPKFPNRIKJRDK 

DGDGVGDACDSCPDVSNPNQ 


3709 


A 


144 


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VLLIVGDQKFRAHKNVLAASSEYFQSLFTNKENE 
SQTVFQLDFCEPDAFDNVLNYTY 


3710 


A 


245 


688 


FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LRNI^VADHSKTQVQKKENKSLKRDTKAHDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGU 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 
TPAMMNGQGSTTSSSKNIAYNCCWDQCQACFNS 
SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA 
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMNKRRKLKNKRRRSLARPHDFFDAQTLDAIR 
HRAICFNLSAHIESLGKGHSVVFHSTVSILLFFQIK 
YKTLQKNISHISKSLKI 


3712 


A 


2 


344 


RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGOTRIEEACEMYTRAANMFKMAKNWSAA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 


3713 


A 


20 


974 


GAAATACSSSSSSSGAPATWAAHGPGKDVASPS

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL 

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHWSGKVMSRRAPGSRLSSGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAWSRQRHDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGQELRVNNVTSPE 

FTSVQHGSRALATKDMRKSQERSMSYCDESRLS 

Y1XRMTRENDRDRRLATVKQLKEFIQQPENKLV 

LVKQLDILAAVHDVLNER 


3714


A 


237 


458 


IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 

QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 

DIGFKL 


3715 


A 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGFYH 

EAWLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC

PRSTALRSPGLSPLLH 


3716 


A 


85 


308 


QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 

VPLISPLDISQLQPPLPDQW1KTQTEYQLSSPDQQ 
NYTKSR 


3717 


A 


58 


618 


GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRRRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEVVD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDDCKSYRKLALKYHPDKNPD 

KPEAADKFKEINNAHAILTDATXRNTYDKYGSLG 

LYVAEQFGEENVNTYFVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETBFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 


3719 


A 


2 


2173 


SGGVRMGSRADGPRTSGHVTGKMAVFPWHSRN 

RNYKAEFASCRLEAVPLEFGDYHPLKPITVTESK 

TKKVNRKGSTSSTSSSSSSSWDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSVVG 

SDFEPWTNKRGEILARY'rr i'HKLSINLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQDYVNRIEELNQSLKDAWASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRFYVEASELKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYLVSRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLWQGVELPSYLPLYPPAMDWIFQC1SYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

RSMDFIGM3KBCDESGFPKHLLFRSLGLNLALAD 

PPESDRI^ILNEAWKVITKLKNPQDYINCAEVWV 

EYTCKHFTKREVNTVLADVIKHMTPDRAFEDSY 

PQLQLIIKXVIAHFHDFSVLFSVEKFLPFLDMFQK 

ESVRVEVCKC1\RTPLSSINKSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKRMl^YLINGFIKMVSF 

GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 

LAMETEIKVMKGNHSRKTAAFVRSWGAYWFITIP 

SLAGIFTRLNLYLHSG 


3720 


A 


24 


296 


ENLFRAGFAFSLLRSSFYISKTYCSWFSNLISGSL
ADFNSKGTRDYSPRQMAVRE/KVFDVDRCFKRH 
GAEVIDTPVFELKVRNGQEETTW


3721 


A 


2 


310 


PSCXTCVGHCSIGGSCTMIGIMMPECHCSLHMTG 
PRCEEHVFILQQPGHIASILIPLLVLLLLALVAGW 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 
K 


3722 


A 


75 


722 


MELVAGCYEQVLFGFAVHPEPEACGDHEQWTL 

VADFTHHAHTASLSAVAVNSRFWTGSKDETIHI

YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 

EDGLICIWDAKKWECLKSIKAHKGQVTFLSIHPS 

GKLALSVGTDKTLRTWNLVEGRSAFIKNIKQNA 

HIVEWSPRGEQYVVUQNKIDIYQLDTASISGTrrN 

EKRISSVKFLSES 


3723 


A 


110 


316 


MEL^DKRRSGGLEGLAEKCP>^TYLNLSGKKIK
DI^TVEALVSGTVLSLDLLFLVKFSEICLCLUSI 


3724 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS
VRACASLGVLSFPELEVVYEESRMVSLTAPYVSG
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 


3725 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 
VRACASLGVLSFPELEVVYEESRMVSLTAPYVSG
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 


3726 


A 


1 


433 


SSDDRSLFRRLKLNYAIFDEGHMLKNMGSIRYQ 

HLNTTINANNRLLLTGTPVQNNLLELMSLLNFVM 

PHMFSSSTSEIKRMFSSKTKSADEQSIYEKERIAH 

AKQHKPHLRRVKEEVLKQLPPKKDRIELCAMSE 

KQEQLYLG 


3727 


A 


6 


383 


RJPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AFT^EKGQLRHIKTGEPFVFNYREHLHRWNQKRY 
EALGEHTKYVYELLEKDCNSKKVS 


3728 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDE1THDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNTLMAQLQETMRCVCRFDNRTCRKLLASIAEDY

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQI^DAQLAIERSVMNRIrTCLAFYPNQDGDlLR 

DQVLHEmQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSWGADDFVPVLVFVLEKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI

LQPKQIiFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ

RHS/YTPERLVRSRSSNDIVSSVRRPMSDPSWNRR 

PNGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRtS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST
EVMGDGESAHDSPRDEALQNISADDLPDSASQA 
AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 
HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 
KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 
RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 
DKEVANRYK1TVCVRLLLESKEKKIREFIQDFQK 
LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 
EEQLQDAQLAIERSVMNRIFKJLAFYPNQDGDILR 
DQVLHEHIQRLSKVVTANHRALQEPEVYLREAP 
WPSAQSEIRTTSAYKTPRDKVQCILRMCSTIMNLL 
SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 
QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 
RK


3730 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHDEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSATPERLVRSRSSVD1VSSVRRPMSDPSWNRR 

PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RG£TEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

K^MAQLQETMRCVCRFDNRTCRKIXASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYKITVCVRLLLESKEKKIREFIQDFQK

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLADBRSVMNRIFKLAFYPNQDGDILR 

DQVLHEfflQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3731 


A 


1 


1305 


VNTAMHEAXLMEECDELVEnQQRKQMIAVKIK 

ETKVWaOURKXAQQVANCRQCLERSTVLINQAEH 

ILKENDQARFLQSAKNIAERVAMATASSQVLIPDI 

NFNDAFENFALDFSREKKLLEGLDYLTAPNPPSIR 

EELCTASHDTTTVHWISDDEFSISSYELQYTIFTGQ 

AOTISLYNSVDSWMIVPNIKQNHYTVHGLQSGTR 

YIFIVKAINQAGSRNSEPTRLKTNSQPFKLDPKMT 

HKKLKISNDGLQMEKDESSLKKSHTPERFSGTGC 

YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 

PKNEWIGKNASSWVFSR(>ISNFVVRHNNKEML 

VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSL\H 

LHTFDVTFVILPVCPTFTIWNKSLMILSGLPAPDFI 

DYPERQECNCRPQESPYVSGMKTCH 


3732 


A 


127 


2832 


LGQRLSLVPRPSLKRRLGKRLSLGLRERMMSLW 
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








* 


EFGTBAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACLWIEN*SMWM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTNNLFRPRVREEANIRSKXRTNREDCFESESED 

EFYKQ S WVLPGEEAN\IDSGTETKKILILP WKLRA 

QKDVDSDRVKQEPRFEEEVIIGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

IjraGVGFRSTSPFGIPEEASEMLEAKPICNLELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREIREHL 

RARESAESBSWSCSCIQCELKIGSEEFEEFLLLMD 

KIRDPFfflEISKIAMGMRSASQFTRDFIRDSGVVS 

LIETLLNYPSSRVRTSFl-ENMIHMAPPYFb^NMIE 

TnCQVCEETI^HSVDSLEQLTGNKGCTRHLTMT 

IDYHT\LMN*YGPGFPLLF*PQAQCGEIXFHVLK 

MLLNLSENPAVAKKLFSAKALSIFVGLFNIEETN 

DNIQIVIKMFQNISNIIKSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFV GKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW 


3733 


A 


2 


3274 


DVPLIRJEEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAILPDEIFRLVKIRFLIEDINDNAPLFPAT 

VINISIPENSAINSKYTLPAAVDPDVGINGVQNYE 

LIKSQNIFGLDVIETPGGDKMPQLIVQKELDREEK 

DTYVMKVKVEDGGFPQRSSTAILQVSVTDTNDN 

HPVFKETEDEVSIPENAPVGTSVTQLHATDADIGE 

NAKIHFSFSNL V SNI ARRLFHLNATTGLITIKEPLD 

REETPNHKLLVLASDGGLMPARAMVLVNVTDV 

M>KVTSIDIRYIVOTV>roTVVLSEhnPLNTKIAL^ 

VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAEKLLANADAGKPPLNQSAM 

LFKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKKLDREKEDKYX.K1 '1L AKDNG VPPLTS 

NVTVFV SIIDQNDNSP VFTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFTIDSQ 

TGVIRPN1SFDREKQESYTFYVKAEDGGRVSRSSS 

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGTVWQVIAVDNDTGMNAEVRYSIVGGNTRDL 

FAIDQETGNITLMEKCDVTDLGLHRVLVKANDL 

GQPDSLFSVVIVNLFVNESVTNATXINELVPQKH 

LKHQ*PQILEIADVSSPTSDYVKJLVAAVAGTITV i 

W VIFTI A WRCRQAPHLKAAQKNMQNSE WATP 

NPENRQMIMMKKKXKKKKHSPKNLLLNWTIEE 

TKADDVDSDGNRVTLDLPEDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

LKHHnQELPLDNTFVACDSISNCSSSSSDPYSVSD 

CGYPV'ITKEVPVSVHTRPPVDLEVGGAQSGQVAI 

LTSSLMELLLCLMV AAFLPLELRPLG QQNVMS W 

EQEAKJLLVGYWGDGEWCHFHFHHLBPGPVNPG 

YERKQYHILDSDSEDTQPSGELCPlPVRPFnLSIQ 

LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


3734 


A 


1 


840 


GTRPGHLPAPSDGFCV/HL* SIPS WGSF* GESL/EM 

QLITSLGLQEFDIAKNVLELIYAQTLVWIGIFFCPL 

LPHQNDMLFIMFYSKMSLMNINFQPPSKAWRAS 

QMMTFFEFLLFFPSFTGVLCTLAITTWRLKPSADC 

GPFRGLPLFIHSIYSWIDTLSTRPGYLWWWIYRN 

LIGSVHFFFILTLIVLnTYLYWQITEGRKIMIRLLH 

EQIINEGKDKMFLIEKLIKLQDMEKKANPSSLVLE 

RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR 

A 


3735 


A 


2 


432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPFIFN 
NLSKIKLLHTDTLLKffiSKKHKAYLRSAAIEEERE 
SEFALRPTFDLTVRRNHLIEDYLNQL SQFENEDL 
RKELWVSFSGEIGYDLGGSAnCKEIFYCLFAEMIQ 
PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVRJLYQYPNFAGPHAALANKSFFKADKV 

TMLWNKKATAVLVIASTOVDKTGASYYGEQTL 

HYIATNGESAWQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKAT1FNLKCDPVFDFGTGPRNAAYYS 

PHGHILVLAGFGNLILQI*AD/IMKVWNVKNYKLI 

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

GYKIWHYTGSILHKYDVPSNAELWQVSWQPFLD 

GIFPAKTITYQAVPSEVPNEEPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NTVSQSISGDPEIDKKIKNLKKKLKAIEQLKEQAA 

TGKQLEKNQLEKIQKETALLQELEDLELGI 


3737 


A 


3190 


664 



VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTirrriUV 11 1VTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 
YPGYGIEIQVQTLNJLSQEEELLVLAGGGSPGLAP 
RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 
GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 
TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 
CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 
WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 
SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 
PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 
YRPGALATFSCLPGYALEPPGPPNABECVDPTEPH 
WNDTEPACKAMCGGELSEPAGWLSPDWPQSY 
SPGQDCVWGVHVQEEKR1LLQVEILNVREGDML 
TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 
QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 
PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 
DILTCQWDLSWSAAPPACQKIMTCADPGE1ANG 
HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 
YSRDTGTPKWSDRVPKCALKYEPCLNPGVPBNG 
YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 
PGHPSQWTSQPPLCKVTQTTOPSRQLEGOILAL 
AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 
SYSPITVESDFSNPLYEAGDTREYEVSI 


3738 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTOTTTTVTTTVTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTY SIHV 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTOLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTTHNATLGRIVSPEPGGAVGPNLTCR 

WVTEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVrriDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TUTDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPKNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DJLTCQ WDLS WSA APPACQKIMTC ADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYTYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A 


734 


445 


LLEPEPAEEYTEQSEVEST/EGMILI* CCLYFAAFQ 
TNVSNIYFALQYVNRQFMAETQFTSGEKEQVDE 
WTVETVEVRVLCI AKLLSLS SVSNFYLY 


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLV>n"nCNF 

DIGPKFIQVGWQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGhmCTGKAIQFALDYLFAKSSRFLT 

KIAVVLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

WKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWRILTIDG/* 

PQIAVTLNGVDKDLLFTTTSVINGSQWTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 

HPVLGELINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPG SPGIQG ARGL . 

PGYKGEPGRDGDK 


3741 


A 


5048 


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPHTVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

U>AGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDNLIRNHTGSLAVANNNKiriVADSLSCPVM 

QNVQPPKSSPWSTVLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 

VPNLNADLKKLNCSPDSFRCTLTNBPQTQALLNK 

AKLPLGLLLHPFRDLTQLPVl'l-SNTIVRCRSCRTYI 

NP\FV SFEDQRR* KC>HLC YRVND VPEEFMYNPLT 

RS YGEPHKRPEVQNS\TVEF1AS SDYMLRPPQPAV 

YLFVLD V SHN A VE AG YLTI/L WCQSLLEVNTLDKLP 

G\DSRTVRIGFMTFD\STYSFLQFTQEGLSQPQMLI 

VSDIDDVFLPTPDSLLVNLYESKELIKDLLNALPN 

MFTNTRETHSALGPALQAAFKLMSPTGGRVSVF 

QTQLPSLGAGLLQSREDPNQRSSTKWQHLGPAT 

DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 

CMSKYSAGCIYYYPSFHYTHNPSQAEKLQKDLK 

RYLTRKIGFEA VMRIRCTKGLSMHTFHGN FF VRS 

TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQT 

ALLYTSSKGERR1RVHTLCLPWSSLSDVYAGVD 

VQAAICLLANMAVDRSVSSSLSDARDALVNAVV 

DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 

LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 

KMIHPhJLYRIDRLTDEGAVHVNDRIVPQPPLQKL 

SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 

DVLGYTNFASIPQKMTHLPELDTLSSERARSFIT 

WLRDSRPLSPILHTVKDESPAKAEFFQHLIEDRTC 

AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMIVPAAPYLPGLIQGNQE 
A AAAPDTMAQP YA S AQFAPPQNGIPAEYTAPHP 
HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 
SAQTVSGTRNKQD* RSTDG WPSPKTQTS*KHGK 
QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKJLD 
VEIIFNERG SKGFGF VTFENS ADADRAREKVLHGT 
VV\EGRKI\EVN\NATARVMTNKXTVNPYTNGWK 
LNPVVGAVYSPEFYAGTVLLCQANQEGSSMYSA 
PSTDFRGAKLHTSRPLLSGS 


3743 


A 


3 


1456 


QFQQAWMQNKVPIPAPNEVLNDRKED1KLEEKK 

KTQAEIEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGSXKGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGFQGSSGPQG 

HMGPQGPPGPQGHIGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL ! 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


PLTGRKCPGWTHSGSRRSPRIAEEVPGFPKRAEA 

SRQFSETADRLELLRRAVMAAARATTPADGEEP 

APEAEALAAARERSSRFLSGLELVKQGAEARVFR 

GRFQGR^VIKHRFPKGYRHPALEARLGRRRTV 

QEARALLRCRRAGISAPWFFVDYASNCLYMEE1 

EG S VTVRDMF SPL WRLKKTPQGLSNLAKTIG Q VL 

ARMHDEDLfflGDL'n^NMLLKPPLEQLNIVLIDF 

GLSFISAU>EDKGVDLYVLEKAFLSTHPNTETVFE 






WO 01/57190 



PCT/US01/04098 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence



3745 



127 



Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence



1433 



AFLKSYSTSSKiCARPVLKKLDEVRLRGKKRSMV 
G 



GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 

LACSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 

QDRGLWTDLKAESWLEHRSYCSAKARDRHFA 

GDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 

QLKJEOIGREMFEVTGLHDVDQGWMRAVRKHAK 

GL\P*CLGSCLRTGLTMISG/YVLDSEDEIEELSKT 

WQVAKNQHFDGFVVEVWNQLLSQKRVGLIHM 

LTHLAEALHQARLLALLVIPPAITPGTOQLGMFT 

HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 

SWVRACVQVLDPKSRWRSKDLLGLNFYGMDYA 

TSKDAREPVVGARYIQTLKDHRPRMVWDSQVSE 

HFFEYKKSRSGRHVVFYPTLKSLQVRLELARELG 

VGVSIWELGQGLDYFVDLL*VGIAASAVDVFFSK 

PWSE 



3746 



898 



IDRAAECRTKPLPMAVSIRGNADSIVACLVLMVL 

YLIKKRLVACAAWYGFAVHMKIYPETYILPITL 

HLLPDRDNDKSLRQFRYTFQACL*ELLKRLCNRT 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYF 

YHLTRRDIRHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSIFVTFN 

KVCTSQYFLWYLCLLPLVMPLVRMPWKRAVVL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQDSHYKEEPLTERIKYD 



3747 



2325 



MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 

WTVKRKMTRAWCPDLKAVWKIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVT1KNLAVDFRQQLHPAQKNFCKNGIWEN 

NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 

TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSN 

TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE 

PITHNKTLSKERERTYNKSGRWFYLDDSEEKVH 

NRDSIKNFQKSSW1KQTGIYAGKKLFKCNECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS 

SFARHQRCHTGKKPYECIECGKAFIQNTSLIRHW 

RYYHTGEKPFDCIDCGKAFSDfflGLNQHRRIHTG 

EKPYKCDVCHKSIARYGSSLTVHQRIHTGEKPYE 

CDVCRKAFSHHASLT\Q\HQRVHSGEKPFKCKEC 

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 

SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 

HQKTHTGEKPYECKECGKAFSQTTHLIQHQRVH 

TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 

GKAFSHRQSLS VHQRIHS GKKP YECKECRKTFIQI 

GHLNQHKRVHTGERSYNYKKSRKVFRQTAHLA 

HHQRIHTGESSTCPSLPSTSNPVDLFPKFLWNPSS 

LPSP 



3748 



823 



GGYTKSGYDSACKDFVPHDLEVQIPGRVFLVTG 

GNSGIGKATALEIAKRGGTVHLVCRDQAPAEDA 

RGEIIRE\SGNQNIFLHIVDI^DPKKIWKFVENFKQ 

EHKLHVLVVNNAGCMVNKREAHKXMDFEKKFG 

CQYSGVCTFXTTRPDPLCWRKNTDPRVIT\VSSG 

GMLVQKLNNQ* SPVRKNTIWMGTMVYAQNKVS 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










ERQQVVLTVERWGPRAPG\IHFSSMHPGWA\DTPG 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAQRP 


3749 



A 


1939 


715 


GFLRLSQAT\RQRLSIPVMVLTLDPTRD\QCFGDR 

FSRLLLDEFLGYDDILVMSSVKGLAENEBNKGFLR 

NVVSGEHYRFV\SMWMART\SYLAAFANHGQSF 

TLSVSHACCGYSHHQIFVFIVDLLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFNDTTTAFYIILI 

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 

AYHYRFNGQ YS SL ALVTS WLFIQHSMI YFFHHYE 

LPAILQHVRIQ\EMLLQAPTLGPGTPTA\LPDDMN 

NNSGAPATAFSDSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAAIIT 

DASFLSGLSASLLERRPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 

VGS 


3750 


A 


2 


844 


GLLEPFSKLLSFV1QNAVFTLAYLVELCGLCYRA 

FTKERDKFYLSRSVVLELLQALKLKSPLPDTNLL 

LLVQFICADAGTKLAESTILSKQM1ASVPGCGTA 

AMECVKQ YINEVLDFMVADMrTTLTKLKS HMKTC 

SQPLHEDTFGGHLKVGLAQ1AAMDISRGNHRDN 

KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 

SWLLLGSLTHNAVC/LKWPPLPGLPIPLDAGSHV 

ADHLIVILIGFPEQSKTSVL\HMCSLFHAF\SLAQL 

WDSLLARQSGRW 


3751 


A


431 


2 


AJFTRKCEETAFIVPQCEIIPTE/WVCRRIPTGSSLER 
NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 
QLIAAKFGFA ALGI/QTEVDIMSHAT* A VFEIPEKS 
RL\PQNCTP\05MKIEFGVHVTSKEILTDVIDNDS* 
RHSPS 


3752


A 


131 


1278 


AWSGSGLLVLCINTASMPMISVLGKMFLWQREG 
PGGRWTCQTSRRVSSDPA WA VE WIELPRGLSLS S 
LG S ARTLRG WSRS SRPSS VDSQDLPEVNVGDTV 
AMLPKSRRALHQEIAAL ARS SLHGIS Q WKDHV 
TKPTAMA QGR VAHLIE WKG WSKPS DSPAALES A 
FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 
WS S VDGEDSTDD S YDEDFAG GMDTDMAG QLPL 
GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 
DTLCSSLCSLEDGLLGSPARLA\PSCWAMSCFSPN 
CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 
AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 
GVVSLDEDEAEPEEQ 


3753 


A 


3 


1138 


YYSSVRQRVTCEEPRFRECAAAL1EGSATEVYAG 

EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 

YGRTTRPDGSREEGKYKRNRLVHGGRVRSLLPL 

ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 

AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP 

PGGDQGPFSSPKAWPEEWGGAGAQAEELAGYE 

AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 

AGCLTEELGEPAATERPAQPGAANPLVVGAVAL 

LDLSLAFLFSQLLT 


3754


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



EDSEj^VVKMLLRLLEDKNGEVQMLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQY1KHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKC1AALISSRPDL 

LPDraCTLAPVlJRRFKEREENVKADVFTAY^ 

LRQTRPPKG WLEAMEEPTQTG SNLHML RGQ VPL 

WKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRKEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQVEAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDLIX>DILPLLYQETKJRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVAR1^\T 

LCPAPVLQRVDRLDEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTS APSTDSMEL S 



3755 



3338 



SSLLEKMTS SDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYUCHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDraCTLAP\nLIRRFKJEREENVKADVFTAYIVL 

LRQTRPPKGWLEAJvIEEPTQTGSNLHMLRGQVPL 

V\f^KALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SUVEHMPVLVSGIIFSLADRSSSSTtRMDALAFLQ 

GIXGTEPAEAFHPHLPDJLPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARlJtATDLDQEVKERAISCMGHLVGHLGD 

RLGDD1JBPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHDLASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERJBLKAVLLEALGSPS 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPEDPLLK 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLKFTVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLDEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESfQKDSTSAPSTDSMELS 


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQILGRIM1TLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TT\HRGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSl^WVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 

MCPSSHTLQPSFLQPGPGPSDSSRPCAASPQSGSW 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 

KRRHEEDPRKLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 

SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 

NLIEEN 


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDHIAE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLffiRMVHAYPLW 
KGLSEKVW/DQHLFKNRPTDGPPNSFHRSLYPKU 
QVIETIESNWQCG*HTLQRIQCHSEKSKGVYCLQ 
YDDEK 


3758 


A 


2


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPHFLGEPEDVYIVKNKPVLLVCKAVPATQIFF 

K(^GEWVRQVDHVIERSTOGSSGLPTMEVRJNV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 

PPAE 


3759 


A 


1 


561 


ADDTLHLWNLRQKRPAILHSLKFCRERVTFCHLP 

FQSKWLYVGTERGNIHIVNVESFrLSGYVIMWN 

KAIELSSKSHPGPWHISDNPMDEGKLLIGFESGT 

VVLWDLKSKKADYRYTYDEAIHSVAWHHEGKQ 

nCSHSDGTLTIWNVRSPAKPVQTTTPHGKQLKD 

GKKPEPCKPILKVEFX1TR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

LKHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 

TVDRWLLYDEHGERRDKFSTKPADMKYGRKS 

YMVKGMAFSPDSTKIAIGQTD>niYVYKIGEDWG 

DKKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IQ 

PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYnVFGLAEGKVRLS 

NTKTNK SSTI YGTES YVVSLTTNCSGKGIL SGHA 

DGYQR 


3761 


A


2253 


320 


PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSRKVF 

QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH 

GSPQMVRRDIGLSVTHRJPSTKSWLSQVCHVCQK 

SM1FGVKCKHCRLKCHNKCTKEAPACRISFLPLT 

RLRRTESVPSDINKPVDRAAEPHFGTLPKALTKK 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPP\hn>SP\GQR\DSRFNFPSOArFIHHR\Q\QFI 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHEhTS^LFMGACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIDCGMGYL 

HAKGIVHKDLKSRNVFYDNGVKVVITDFGLFXGIS 

GVWXEGRRENQLKI^HDWLCVTJ^PEIVREMTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPSVFSLLMDMLEKLPKLNRRLS 

HPGHF*KSADINSSKWPRFERFGLGVLESSNPK 

M


3762 


A 


2 


1578 


MAHYITFLCNfVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGWQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYIFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKJTLFATG 

VGSETEDAELRA1ANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTR1PVAARDERGFDDLLGLD 

VNKJCVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKK1WD3LWRILTIDG/* 

PQIAVTLNGVDKILLFTTTSVINGSQVVTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRfY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL 

PGYKGEPGRDGDK 


3763 


A 


3 

* 


1267 


CKVWRNPLNLFRGAEYNRYTWVTGREPLTYYD 

MNLSAQDHQTFFTCDSDHLRPADAIMQKAWRE 

RNPQARlSAAHEALEIhnECATAYIlXAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQVTRHQCLGVHQKKASNVCQKTRE 

DQGSSENDERFNEGVPPSEYVQYP*KPFVKALLEL 

QAYADVQAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

NPHVPKYLLEMKSLILPPBHILKRGDSEAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLFYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 


3764 


A 


25 


1032 


RS AD GLC GNKDRERGNEFTRNQQA AQEV VNPK 

KKMKKKKYVNSGTVTLLSFAVESECTFLDYTKG 

GTQINFTVAEDFTASNGNPSQSTSLHYMSPYQLN 

AYALALTAVGEUQHYDSDKMFPALGFGAKLPPD 

GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 

VQLYGPTNFAPVVTHVARNAAAVQDGSQYSVL 

LnTDGVISDMAQTKEAIWGNSKIJPMSinVGVGQ 

AEFNAMVELDGDDVRISSRGKLAERDIVQFVPFR 

DYVDRTGNHVl^MARLARDVLAEIPDQLVSYM 

KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



KNFDSAKVPSDEYCPACKEKGKLKALKTYRISFQ 

ESIFLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYKDSLLLANSKKTKNY1AIDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKJDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDdLSALVHSEELKNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLKLETHIEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLV'J KI'NVIPEWHPLNAAHFGPCNNCNSKSQI 

RKMYLEKVSPIFMLHFVEG LPQND LQH YAFHFE 

GCLYQITSVIQYRANNHFITWILDADGSWLECDD 

IXGPCSERHKKFEVPASElHmWERKISQVTDKE 

AACLPLKKTNDQHALSNEKPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNBLPLTLEETIQKTASVSQLNSEAFLNLEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

QDQFVDISFPSQVWTNMQSVQLNTBDTVNTKS 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKTEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSVKGVNNFGGFKTXGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKJKLKAEKiCKJLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCES1EDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSH1PPPVPSEFNDVSQNT 

HLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTL 

NLESPMKTDIFDEFFSS S ALNAL ANDTLDLPHFDE 

YLFENY 


3766 


A 


3 



1622 


AQQIVYRhTV^^EKV'ICNLVSLGYQLTKJPDVlIJU* 

EKGEEPWL VERE3HQETHPDSETAFEIKSS V SSRSI 

FKDKQSCDKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQ VAFTQKKVLTQERV SESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFAHSSRLIRHQR 

THTOEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRJHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHERENAY 


3767


A 


3 


1622 


AQQ1VYRKVMLEKYKNLVSIX3YQLTKPDVILRL 

EKGEEPWL VEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTECSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKS YKCPDNDN SLTHG S SLG1SKG IHREKP YECK 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEBPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSF/VHS SRLIRHQR 

THTGEKPYECPECGKSFRQSTHLIlJiQRTOVRVR 

FYECNECGKSYSQRSHLWHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSAUVHQRIHTGEKPYECCQCGKArTRKNDLIK 

HQRIHVGEETYXCNQCGIIFSQNSPFIVHQ1AHTG 

EQFLTCN QCGTALVNTSNLIG YQTNHIRENA Y 


3768 



A 


185 


2258 


SUIKMSRKISKESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLIERKELLHNIQLLKIELS 

QKTMMIDNLKVDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETELLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSIPEYV S VRFYEL VNPLRKE1CELQ V 

KKNILAEELSTNKKQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEILEASHM1QTKERSELSK 

EVVTLEQTVTLLQKDKEYLNRQNMELSVRCAHE 

EDRLERLQAQI^ESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

EREhflclNLREARDNAVAEKERAVMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKRJTELQAQNSEHQARJLDIYEKLEK 

ELDEIIMQTAEDBNEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKIDSLTESIAQUERKDVSNLNKEKSALLQTN 

GDCMAL\DL\DQLLNHP 


3769 


A 


3 


2297 


DAAEFRVVADAMKVIGFKPEEIQTVYKJLAAILH 

LGNLKFWDGDTPLIENGKVVSnAELLSTKTDM 

VEKALLYRTVATGRDUDKQHTEQEASYGRDAF 

AKAIYERLFCWIVTRIMDIIEVKNYDTTfflGKNTV 

IGVLDIYGFEIFDNNSFEQFCINYCNEKLQQLFIQL 

VLKQEQEEYQREGIPWKHIDYFNNQIIVDLVEQQ 

HKGIIAI1JDDACMNVGKVTOEMFLEALNSKLGK 

HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 

VIGFmKNKDTLFQDFKRLMYNS SNP VLKNMWP 

EGKLSITEVTKRPLTAATLFKKSMIALVDNLASK 

EPYYVRCDCPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDIJPSDKEAVKKilERCGFQDDVAYGKTKIFIRT 

PRTLFTLEELRAQMLDUVLFLQKVWRGTLARMR 

YKRTKAALTHRYYRRYKVKSYIHEVARRFHGVK 

TMRD YGKHVK WP SPPKVLRRFEE ALQTIFNR WR 

ASQLIKSIPASDLPQVRAKVAAVEMLKGQRADL 

GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 

KDKYMNVLFSCHVRKVNRFSKVEDRAJFVTDRH 

LYKMDPTKQYKVMKTIPLYNLTGLSVSNGKDQL 

VVFOTKDhfKPLIVCLFSKQPTHESRIGELWGVLV 

NHFKSEKJOILQVVNVTNPVQCSLHGKKCTVSVE 

TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276 


" HKVAAPDVVVPTLDTVRHEAIXYTWLAEHKPL 
VLCGPPGSGKTMTLFSALRAJLPDMEWGLNFSS 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



ATTPELLLKTFDHYCEYRRTPNGVVLAPVQLGK 

WLVLFCDEINLPDMDKYGTQRVISFIRQMVEHG 

GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 

HRFLRHWVVYVDYPGPASLTQIYGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRIWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 

KARLKVFYEEELDVPLVLFNEVLDHVLR1DRIFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDLRTVLRRSGCKNEK1AFIM 

DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 

TLMTQCKEGAQKEGLMLDSHEELYKWFTSQVIR 

NLHVVFTMNPSSEGLKDRAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYIVPDYM 

PWYDKLPQPPSHREAJVNSCVFVHQTLHQANA 

RLAKRGGRTMAITPRHYLDFINHYAl^FHEKRSE 

LEEQQMHLNVGLRKJKETVDQVEELRRDLRIKS 

QELEVKNAAANDKLKKMVKDQQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 

ICLLLGESTTDWKQIRSIIMRENFIPTIVNFSAEEIS 

DArREKMKK^^mSNPSY^^re^VNRASI^CGP^ 

KWAIAQLNYADMLKRVEPLKNELQKLEDDAKD 

NQQKANEVEQMIRJDLEASIARYKEEYAVLISEAQ 

AIKADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR 

QNLFTTWSHHLQQ ANIQFRTDIARTE YL SN ADER 

LRWQASSLPADDLCTENAIMLKRFNRYPLnDPS 

GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 

RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 

LITLGDQDBDLSPSFVIFLSTRDPTVEFPPDLCSRV 

TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 

LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 

TnTTLENLKREAAEVTRKVEETDIVMQEVETVS 

QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 

DIYHhr^YEOTNLKGVTDmQRLSUTKDLFQVA 

FNRVARGMLHQDHITFAMLLAJUKLKGTVGEPT 

YD AEFQHFLRGNEIVLS AG STPRIQ GLTVEQ AE A 

WRLSCLPAFKDLLAKVQADEQFGIWLDSSSPEQ 

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQrVGTCVKP 

mPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

G S AEGFNQADKAINTA VKS GRWVMLKNVHL AP 

GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 

N1XRAGRIFVFEPPPGVKANMLRTFSSIPVSRICK 

SPNERARLYFLLAWFHAHQERLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 

FTTRSFDSEFKLACKVDGHKPIQMPDGIRREEFV 

QWVELLPDTQTPSWLGLPNNAERVLLTTQGVD 

MISKMLKMQMLEDEDDLAYAETEKKTRTDSTS 

DGRmWMRTLHTTASNWLHLIPQTLSHLKRTVE 

NDCPPLFRFFE\REVKMGAKLLQ\DVRQDLADV\V 

QVCEGKKXQThmJITUVNELVVKGILPVRSWSHY



TVPAG\MTVIQWGVPISARRI\KQLQNISL\AAASG 

GAKELKNIHVCLG GLFVPEAYTTATRQ YVAQ AN 

SWSLEELCLEVKNTTTSQGATLDACSFGVTGLKL 

QGATOWNKLSLSNAISTALPLTQLRWVKQTNT 

EKKASVVTLPVYLNFTRADLIFrVDFEIATKEDPR 

SFYERGVAVLCTE 



2043 



LPLLHAGFNRRFMENSSnACYNELIQIEHGEVRS 

QFKLRACN S VFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIGQGGKIRHr^SLKKLCCTTDNNKQIHKIHR 

DSGDNSQTEPH SFRYKNRRKESIDVKSISSRGSD A 

PSLQNRRYPSMARfflSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEILRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRI^GNEYVFTKNVHQSHSH 

LAMPITINDVPPCISQLLDNEESWDFNIFELEAITH 

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHS SNA YHNSTHAADVLHATAFFLGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFLVC 

NAGSELAVLYNDT\AV\LESHHTALAFQ\LTVKDT 

K\CNXFKNm/RGNHYRTLRQAimMVLATEMTKH 

FEHVNKFVNSINKPMAAEIEGSDCECNPAGKNFP 

ENQILrKRMMIKCADVANPCRPLDLCIEWAGRIS 

EEYFAQTDEEKRQGLPVVMPVFDRNTCSIPKSQI 

SF1DYFITDMFDAWDAFAHLPALMQHLADNYKH 

WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 

CESQ 



1013 



50 



TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 

HELIKEAEDQGIMALLTRTLEEASEQIRMNRSAK 

YNLEKDLKDKFVALTIDDICFSLNNNSPNIRYSEN 

AVRIEPNSVSLEDWLDFSSTNVEKADKQRNNSL 

MLKAL VDXRILS QTANYLRKQCD WHTAFKNGL 

KDTKDARDQLADHLAKWMEEIASQEKNITALEK 

AILDQEGPAKVAHTRLETRTHRPNVELCRDVAQ 

YRLMKEVQEITHNVARLKETLAVQAQAELKGLH 

RRQLALQEEIQVKENTIYIDEVLCMQMRKSIPLR 

DGEDHGVWAGGLRPDAVC 



1 



955 



AAARESERQLRLRLCVLNED LGTE RDYVGTLRFL 

QSAFLHRIRQNVADSVEKGLTEENVKVLFSNEBDI 

LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 

DKFCVYEEYCSNHEKALRLLVELNKIPTVRAFLL 

SCMLLGGRKTTDIPLEGYLVLSPIQRICKYPLLLKE 

LAKRTPGKHPDHPAVQ\SALQAMKTVCSN1NETK 

RQMEKLEALEAAA/QSHffiGWEGSNLTDICTQLL 

LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 

TGSKKSTKRTKSINGSLY1FRGRINTEVMEVENVE 

DGTGSPSPSLA 



4254 



2061 



ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLI 

RVDGKGSDCELFPTGKQLEPLVAPLADGKVAVG 

QDDLTWLNEEGICTQKCALNWTDIPVAMEHQP 

PYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQEE 

LALQLAEMKDDSDSEKQQQIHHIKNLYAFNLFC 

QKRFDESMQVFAKLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIPYLTQKRS



3775 



1832 



839 



QLVKKU^SDHQSSTSPLMEGTPTlKSlOCKLLQn 

DTT1JLKCYLHTNVALVAPLLRLENNHCHIEESEH 

VLKKAHKYSELIILYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGIJCIFTEDLPEVESLPRDRVLGFLIEN 

FKGL AIP YLEHDHVWEETG SRFHNCLIQL YCEK V 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQALFrYVHILKJDTRMAEEYCHKHYDRN 

KIX3NKDVYLSLUUvm,SPPSIHCLGPlKLELLEPK 

ANLQAALQVLELHHSKJLDTTKALNLLPANTQIN 

DIRIFLEK VLEEN A QKKRJWQ VLKNLLHAEFLR V\ 

QEERILHQQVKCI1TEEKVCMVCKKKIGNSAFAR 

YPNGVWHYFCSVKEVNPADT 



MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLLRTWXISRARQQTTTFTDGDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

VYLGRPSLDHPIEATERVQGGRTVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGYIVEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNWNVAGGFSLHQ 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 



3776 



796 



3777 



3778 



3779 



PRAKLGTRARNMAGQDAGCGRGGDDYSEDEGD 

SSVSRAAVEVFGKLKDLNCPFLEGLYITEPKTIQE 

LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 

PTEVK1QEMTKLGHELMLCAPDDQELLKGCACA 

QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 

REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 

PLLNKQSDDWQWASASAKSEEEEKLAELARQLQ 

ESAAKLHALRTEYPAQHEQGAAAGAAXTSAP 



413 



SEEDVIEGKTAVIEKRRKKRSSAGVVED/IGGEVQ 

N\flwEGVGVDlNKALLAKRKRLEMYTKLASLRTSN 

QKIEHVWKTQQDQRQKLNQEYSQQFLTLFQQW 

DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 

LLL 



132 



788 



934 



3780 



2535 



SRLPPPPPHLADGRAGARVPRSARLSRWWVQD 

WTHGPIVRPPAAARmWVNPEEVLLANALWITE 

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

VVLDSSARVAPYRILYQTPDSLVYWTIACGVGSR 

KEITEHWEWLEQhn^LQTLSIFENEKDnTFVRGKI 

QGIIAEYNKINDVKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 



CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEEIIQEVFEQCGDITAIRKSKKNFCHIRFAEEF 

MVDKAIYl^GYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAALLAEKJLKDDSKFSEAMVQ 

VLLSWIERGEVNRRXSANQFYSMVQSANSHVRRL 

MNEKATHEQEMEEAKENFKNALTGBUTQFEQIV 

AVFNASTRQKAWDHFSKAQRKNIDIWAK\HSEE 

LRNAQSEQLMGIRREEEMEMSDDENCDSPTKXM 

RVDESALGAP 

AAQAEREELAAGRMPGGGPQGAPAAAGGGGVS



3781 



995 



HRAGSRDCLPPAACFRRRKLARRPGYMRSSTGP 

GIGFLSPAVGTLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAG CGGDGSS GSGD 

AHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKEIEALQARMFVLEAKDQQLRRE 

IEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLK 

EITTKVCMSEKJCSTLRKKVNDrETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSRNVKKLGSVKEDYNRLRREVEHQETA 

YETSVKEOTMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK 

CEDIGKKLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 



GRRRAGP AHS ARMYNMMETELKPPGPQQTS GG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFEDEAKRLRALHMKEHPDYKYRPRRK'IK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPWTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 



3782 



2649 



FRVPDSCPVVLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

NLYETCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTIFLCTYRAFTTTQQVLDLLFKRYGRCDALTA 

SSRYGCILPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHLLLAQLEHSEPIEAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPWAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKVVP 

YHCLGSIWSQRDKKGKEHLAPTIRATVTQFNSV 

ANCVITTCLGNRSTKAPDRARVVEHWIEVAREC 

RILKOTSSLYAILSALQSNSIHRLKKTWEDVSRDS 

FRIFQKLSEIFSDENNYSLSRELLIKEGTSKFATLE 

MNPKRAQKRPKETGnQGTVPYLGTFLTDLVML 

DTAMKDYXYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS

TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAG 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKIPEN ANVFY AMN STAN YDF VLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 


3783 


A 


3 


869 


RSGQGKYYGLIGRRRFQQMDVLEGLNLLITISGK 

RNKLRVYYLSWIJWKILHhn)PEVEKKQGWTTV 

GDMEGCGHYRVVKYER1KFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT 

PHAnFLPKTOGMEMLLCYEDEGVYVNTYGRirK 

DWLQWGEMFJ-SVAYICSNQIMGWGEKAIEIRS 

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSQVYFMTLNRNCIMNW 


3784 


A 


1213 


457 


LSPRQVDGLAGLQKGLSLSLLYQFLMNGIRLGTY 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVrVGSS 

TQLCITSSTKDLLSQWEIFPPQSWKLALVAAMM 

SGIAVVLAMAPFDVACTRLYNQPHRCTGQGPNLY 

RGILDALLQTARTEGIFGMYKGIG A S YFRLGPHTI 

LSLKFWDQLRSLYYTOTK 


3785 


A 


193 


813 


RRRGRHSLCGGKMLAYCVQDATVVDVEKRRNP 

SKHYVYIINVTWSDSTSQTIYRRYXSKFFDLQMQL 

LD\KFPI\ESGQKDPKQRIIPFLPGKJLFRRSHIRDV 

AVKRLKPEDEYCRALVRLPPHISQCDEVFRFFEAR 

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHVVNCVTQKCLFVFHFXFSSSG>fKE 

SKSL 


3786 


A 


3785 


1632 



EFVGRAASTTVVTRIAWRMADAGIRRVVPSDLY 

PLVLGFLRDNQLSEVANKFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KAS SSDSEDSSEEEEE VQGPPAKKA AVPAKRVGL

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSOSDSSSEDEPP 

KNQKPKITP\VTVKAQTKAPPKPARA\APKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

KNKPGPYS SVPPPS APPPKKSLGTQPPKKA VEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVWSKSGSLKKR 

KQNEAAKEAETPQAKKIKLQTPKITPKJOCKGEK 

RASSPFRRVREEEIEVDSRVADNSFDAKRGAAGD 

WGERANQVLKFTKGKSFRHEKTKKKRGSYRGG 

SISVQVNSIKFDSE 


3787 


A 


3 


5078 



IPEG/RALSAEHTSSLVPSLHITTLGQEQAILSGAV 
PASPSTGTADFPSILTF1.QPTENHASPSPWEMPTL

PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KKDSVTAE.GKNEEAmOTPLQAn>RKEVLSLHT 

VNGFVSDFSTGSVSSPIITAPRTNPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESIISGLQQQTNYDLNGHT1STTS 

WETHLAPTAPPNGLTSAADAIKSQDFKDTAGHS 

VTAEGFSIQDLVLGTSIEQPVQQSDMTMVGSHID 

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTS SRVLRASQHPKK WTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLA V AS G 

PAKSSSMTTLAK3m>nCAASGPKRTPGAVHTAF 

PFTPTYMYARTGHTTSTHTA/IARKHGHCLWPW 

YNLP/PP/GKPQAMHTGLPNPTNLEMPRASTPRPL 

TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHNVTVGYYATKGKLVYLPAWIEMLGVY 

GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVUOTKSNLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 

GLHNQSFARVMEQRJLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTIDSQRMALTLHHWLLQAD 

PVVKOTPNNLWIIAAVLAPIAVVTVinnTAVLCR 

KNKNDFKPDTMINLPQRAKPVQGFDYAKQHLG 

QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 

GSTIKTAKSTETRKSRSPSENGSVISNESGKPSSGR 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 

LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSAVLNGEVNKALKQKSDIEHYRNKL 

RLKAKRKGYYDFPAVETSKGLTERKKMYEKAP 

KEMEHVLDPDSELCAPFTESKNRQQMKNSVYRS 

RQSLNSPSPGETEMDLLVTRERPRRGIRNSGYDT 

EPEHEETNIDRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEWTSAPGTMTRPRAGVQWVP 

TYRPEMYQYSLPRPA YRFSQLPEMVMG SPPPP VP 

PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAQ 

LHDSASFTQMSRGPVSVTQLDQSALNYSGNTVP 

AVFAIPAANRPGFTGYFIPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYIEAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLIWSTAALVKAIREEVAKLAKKQTDMFEF 

QV 


3788 


A 


2 


1737 


MKGLYTOAEMKSDNVKDKDAKJSFLQKAJDVV 

VMVSGEPLLAKPARIVAGHEPERTNELLQEGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNKNVREEESRVHKNTEDRGDAEDCERSTSRD 

RKQKEELKEDRMPREKDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK

KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHS WDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKN SVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNIRRJPRPGSARPAPPRVKRQDSMEAL 

QMDRSGSGKTVSNV1TESHNSDNEEDDQFWEA 

APQLSEMSEIEMVTAVELEEEEKHGGLVKKILET 

KKDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEIEKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKJCAVKANILKNEEKIQ 

KMVYSINLTSRR 


3789 


A 



1 



4369 


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFhTWEQVNTLTKPTSDPWMPSGS 

FMLVNASGRPEGQRAHLLLPQLKENDTHCEDFH 

YFVSSKSNSPPGIXNVYVKVNNGPLGNPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVITSGHQG 

YLAIDEVKVLGHPCTRTPHFLRIQNVEVNAGQFA 

TFQCSAIGRTVAGDRLWLQGIDVRDAPLKEIKVT 

SSRRF1ASFNVVNTTKRDAGKYRCMARTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEWEVKSRQITIRWEPFGY 

NVTRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITNLSPYTm^SVKLILMNPEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK 

LGNETHFLFFGLYPGTTYSFTIRASTAKGFGPPAT 

NQFTTiaSAPSMPAYELETPLNQTDNTVTVMLKP 

AHSRGAPVSVYQIWEEERPRRTKKTTEILKCYP 

VPMFQNASLLNSQYYFAAEFPADSLQAAQPFT1G 

DNKTYNGYWNTPLLPYKSYRJYFQAASRANGET 

KIDCVQVATKGAATPKPVPEPEKQTDHTVKIAG 

VIAGILLFVIIFLGVVLVMKKRKLXAKKRKETMS S 

TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD 

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 

THTMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAIRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDEKRMKNRYGNIIAYDHSRVRLQT 

IEGDTNSDY1NGNYIDGYHRPNHYIATQGPMQET 

IYDFWRMVWHENTASIINfVTNLVEVGRVKCCK 

YWPDDTEIYKDIKVTLffiTELLAEYVIRTFAVEKR 

GVHEIREIRQFHFTGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLWHCSAGAGRTGCFIVIDIML 

DMAEREGVVDIYNCVRELRSRRVNMVQTEEQY 

VFIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 

PQTNSSQIKEEFRTLNMV1 PTLRVEDCSIALLPRN 

HEKNRCMDILPPDRCLPFLITIDGESSNYINAALM 

DSYKQPSAFIVTQHPLPmVKDFWRLVLDYHCTS 

VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 

VSADLEEDESRIFRIYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLKLIRQVDKWQEEYNGG 

EGRTWHCLNGGGRSGTFCAISIVCEMLRHQRTV 

DWHAVKTIJ<N^OCPNMVDLIJDQYKFCYEVALE



YLNSG 



3790 



261 



485 



EEQTPLHIASRLGKTEIVQLLLQHMAHPDAATTN 
GYTPLHISAREGQV\DV\ASVLLGRQGAAHSFRLT 
KVRRMTS 



3791 



5874 



LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGNYLRCVAEVGSFEHhnLTTDLLNHLVFVQ 

KVFMKEVNEVIQKVSGCjEQPBPLWNEHDGTADG 

DKPKILLYSLNLQFKGIQVTATTPSMRAVRFETG 

LmLELSNKLQTKASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLRNA 

LREEISGSSDREAVLITLNRPIVYAQPVAFDRAVL 

F^^hTYK\AAYDNWOTQRMALHKJDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTEESTLITACSSESLVSK 

GHFKNFCIRFADGFETSWDDWKPEIHGDLVMNA 

CVVPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WKMCGIDVHMDPNIGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDTMSPTIHTEATDY 

RRQAASASQPGELRGRKIMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERA SSRVGETEELPE1RVDAASP 

GPRVTFNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGIPFQTEEGRRDDSLSSTS 

EDSEKBEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTTPVNRSLSGTATERNIDFELD 

IR VEIDSG KC VLHPTTLLQE HDD ISLRRS YDRS SR 

SLDQDSPSKKKKFQTNYASTTHLMTGKKVPSSL 

QTKP SDLETTVFY IPG VD VKLH YNSKTXKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTNTLLPPQPPPIPAAKGKGSGGVKTAKLYA 

WVALQSLPEEMVISPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEEPDPMEES\TTSLVS\SSTS 

AYSSFPVDWVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDLVFSS^GELETLGTTYPAETLSPGGNA 

TQSGTKTSASKTGIPG SSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVKVSLSRIRRSGGASFFESQSVSKSASKMDTTLI 

MSAVCDIGSASFKYDMRXLSEELAFPRAWYRRSI 

ARRLFLGDQTINLPTSGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN 

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS 

KTSTPFNKSNKAASQQGTPWETLVVFAINLKQL 

m^QMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DRE1 SMS VGLGRSQLDSKGG WGGTID VNALEM 

VAHISEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDLKWDIFQVMISRSTTPDLIKIGMKLQEFFT 

QQFDTSKRALSTWGPVPYLPPKTMTSNLEKSSQE 

QLLDAAHHRHWPGVLKWSGCfflSLFQIPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPNIAFWTEA QKJWEDG SSDHSTYIVQTLDF 

HLGH>n^4VTKPCGALESPMATlTKIlTUU^HENPP 

HGVASVKEWFTsTYVTATRNEELNIXRNVDANNT



3792 



3793 



364 



340 



3794 



3795 



421 



158 



24 



592 



3796 



592 



3797 



1556 



3798 



73 



759 



ENSTTVKNSSLLSGFRGGSSYNHETETIFALPRM 

QLDFKS1HVQEPQEPSLQDASLKPKVECSWTEF 

TDHICVTMDAELIMFLHDLVSAYLKEKEKAIFPP 

RILSTRPGQKSPIIIHDDNSSDKDREDSITYTTVDW 

RDmC^nWHLEPTLRLISWTGRKIDPVGVDYILQ 

K1XjFHHARTTIPKWLQRGVMDPLX>KVLSVLIKK 

LGTALQDEKEKKGKDKEEH 



QNGSTPLHHAASKNRHEIALMLLEGGANPDGKX) 
HYEATAKHQATAKGNFKMIHILLYYKASTnQDT 
EGNTPPHLVCD\RVEEAKLLVSQGA/SIYIENKEE 
KDP/LQVAKGALGLVLKRMVEG 



DrVPNPKMAPLGDEAPTLEKVLTPELSEEEVSTR 

DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 

PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 

KSGPASRPAL 

SYWVGEDYTYKFFEVILIDPFHKAIRKNPDTQWI 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 



GGMDSRVSGTTSNGETKPVYPVMEKKBEDGTLE 

RGHWNNKMEFVL^VAGEnGLGNVWRFPYLCYK 

NGGGAFFJPYLVFLFTCGIPVFLLETALGQYTSQG 

GVTAWRKICPIFEGIGYASQMIVILLNVYYnVLA 

WALFYLFSSFTDDLPWGGCYHEWNTEHCMEFQK 

TNGSLNGTSENATSPVIEFW 



KPASTYSTSQPSMAPLLPIRTLPLILILLALLSPGA 

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFWPPCRGRRELVSWDS 

GAGFTVTRLSAYQVTm.VPGTKFYISYLVKKGT 

ATESSREIPMFTLPRRNMESIGLGMARTGGMVVI 

TVLLSVAMFLLVLGFUALALGSRK 

ATRLLRGSGSWGCSRLRFGPPAYRRFSSGGAYPN 

IPLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 

RVASQNKFGQFCTVGILINSGSRYEAKYLSGIAH 

FLEKLAFSSTARFDSKDEttXTLEKHGGICDCQTS 

RDTTMYAVSADSKGLDTVVALLADVVLQPRLT 

DEEVEMTRMAVQFELEDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GPTPIPELTHIMVGLESCSFLEEDFIPFAVLNMMM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN 

ATSYHHSYEDTGLLCIHASADPRQVREMVEHTK 

EF1LMGGTVDTVELERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHEIXTTLIRNVKPEDV 

KRVASKMLRGKPAVAALGDLTDLPTYEHIQTAL 

SSKDGRLPRTYRLFR 



KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 

QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 

LP WFLNDRPNIKCPKGGL AA YSTS VNLTSDG QV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKWGTOPAJFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MIL VPTVGFMAL WGIS YNA VSLINL VS 

KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLXNCMSITMGSVRPSVEQFHKY 



3799 



73 



759



LPWFLNDRPNKCPKGGLAAYSTSVNLTSDGQV 
LASRFMAYHKPLKNSQDYTEALRAAJRJELAANIT 
ADLRKVPGTDPAFE VFP YTITN VFYEQY LTDLPEG 
LFMLSLCLWTFAVSCIJLXGLDLRSGLLNLLSIV 
MILVDTVGFMALWGISYNAVSLINLVS 


3800 


A 


250 


1032 


GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 

TMGFGDLKSPAGLQVLNDYLADKSYIEGYVPSQ 

ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKE 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 

DDDIDLFGSDDEEESEEAKRLREERLAQYESKKA 

KKPALVAKSSILLDVKPWDDETDMAKLEECVRS 

IQADGLVWGSSKLVPVGYGDCKLQIQCVVEDDK 

VGTDMLEEQITAFEDYVQSMDVAAFNKI 


3801 


A 


155 


656 


SREMELVTFRDVAffiFSPEEWKCLDPAQQNLYR 

DVMLENYRNLVSLGFVISNPDLVTCLEQIKEPCN 

IJCIHETAAKPPAICSPFSQDLSPVQGIEDSFHKLIL 

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCLSTTQSKIFQCNTCVRWSTSSHSNKHK 


3802 


A 

* 


1 


1428 



VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 

EAQRLLYIJEVMLENFALVASLGCGHGTEDEETP 

SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 

ILHLADIJGQKPYLVGECTOHHQHQKHHSAKKS 

LKRDMDRASYVKCCLFCMSLKPFRKWEVGKDL 

PAMLRLLRSLVFPGGKKPGTITECGEDIRSQKSH 

YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK 

CGKAFRGKYSLVQHQRVHTGERPWECNECGKF 

FSQTSHLNDHRRWTGERPYECSECGKLFRQNSS 

LVDHQKIHTGARPYECSQCGKSFSQKATLVKHQ 

RVHTGERPYKCGECGNSFSQSAILNQHRRIHTGA 

KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 

GDCGKSFSQSSILIQHRRIHTGARPYECGQCGKSF 

S QKSGLIQHQV VHTGERPYECNKCGNSFSQCSSL 

HHQKCHNT 


3803 


A 


193 


617 


LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 
QHLKRLKKSGLGHLKWTKAEDIDIETPGSILVNT 
NLRALINKHTFASLPQHFQQYLLLLLPEVDRQMG 
SDGILRLSTSALNNEFFAYAAQGWKQRJLAEGKF 

VFSIIM 


3804 


A 


197 


479 


SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 

TGSSSSPNSSWVSSPLQPEGLSGSSRMKGGSATKI 

LLETL1XAAHMTADQGIASSQRCLL 


3805 


A 


1 


385 


QSADTLFPGDINFNVSGLFSAVTLQDTVSDRLAS 
EELPSTA VPTP ATTPAPAPAPAP ATAPAL V S AAT 
KERTESEVPPRPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 


3806 


A 

* 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSIPFPGDRLLQVDGV1LCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SSN* 

KRIANGLGFSFVQMEKESCSHLKSDLVRIKRLFP 

GHPAEENGAIAAGDIILGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP


3807 


A 


656 


1238 


RCPSLLPPSWPLPTLQTLTRTPGKECAIAGGAGLW 

AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 

QDKPLV1ASDGLWDMLSNEDVVRLVVGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATRLIRHAlGWmYGEMEAERLAAMLTLP 

EDLARMYM)DnVTVVYFNSESIGAYYKGG 


3808 


A 


26 



2195 


SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 

ALLPEFPRGPLDAYRARASFSWKELALFTEGEG 

MLRFKKTEFSALENDPLFARSPGADLSLEKYREL 

hn^RCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

KIFRMEIFGCFALTELSHGSNTKAIRTTAHYDPAT 

EEFimSPDFEAAKFWVGNMGKTATHAVVFAKL 

CVPGDQCHGLHPITVQIRDPKTLLPMrKjVMVGDI 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTYVSPFKDVRQRFGASLGSLSSGRVSIVSL 

AILNLKLAVAIALRFSATRRQFGPTEEEEIPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHAL AS ASKPLAS WTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CTYEGDNN1LLQQTSNYLLGLLAHQVHDGACFR 

SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 

LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWStSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKDDAVALVDVIAP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 


3809 


A 


117 

* 


830 


CFGIMERVGCTLTTTYAHPRPTPTNFLPAISTMAS 

SYRDRFPHSNLTHSLSLPWRPSTYYKVASNSPSV 

APYCTOSQRVSENTMIJ>FVShre.TrFFTRYTPDDW 

YRSNLTNYQESNTSRHNSEKLRVDTSRLIQDKYQ 

QTRKTQADTTQNLGERVNDIGFWKSE1IHEL DEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

HREKRMGIDLVHDEVEAQLLTVNVGEMHQSQA 

A 


3810 


A 


3 


518 


VIQELEGGSGADLGEHSCRPASQPRFPRPAEARS 

HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 

FSEQELKQWYKGFLKDCPSGILNT .KKFQQLYIKF 

FPYGDASKFAQHAFRTFDKNGDGTIDFREFICAL 

SVTSRGSFEQKLNWAFEMYDLDGDGRTTRLEML 

EIIE 


3811 


A 


81 


1147 


GCGYGCSGAGGAAIGEPMAKWGEGDPRWIVEE 

RADATNTVNNWHWTERDASNWSTOKXKTLFLAV 

QVQNEEGKCEVTEVSKLDGEASINNRKGKLEFFY 

EWSVKLNWTGTSKSGVQYKGHVEIPNLSDENSV 

DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 

MG1YISTLKTEFTQGMILPTMNGESVDPVGQPAL 

KTEERKAKPAPSKTQARPVGVKIPTCK1TLKETFL 

TSPEEL YRVFTTQEL VQ AFTHAPATLEADRG GKF 

HMVDGNVS GEFTOL VPEKMVMK WRFKS WPEG 

HFATITLTF1DKNGETELCMEGRGIPAPEEERTRQ 

GWQRYYFEGIKQTFGYGARLF 


3812 


A 


20 


558 


PCGTAASTHAYDRRAKCRQQQQQQQNGGQNKV 
RPAKKKTSPAREVSSESGTSGQFTPPSSTSVPTIAS



SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 


3813 


A 


1 


1016 


CTEPPRRSTRTPAALASLRPYTDYWVSDQILQES 

EDFFTLIESHEGKPLKXMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHWPTQPPSYHKKPP. 

GTPPPSALPLGAPPPDALPPGPTPEDSPSLETGSRQ 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 

LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELT1TAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 


3814 


A 


2 



884 


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

KVPKYLSQQWAKASGRGEVGKLRIAKTQGRTE 

VSFTLNEDLANIHDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIVVQRAECRPAASE 

NYMRJLKRI.QIEESSKPVRLSQQLDKVVTTNYKP 

VANHQYNIEYERKKKEDGKKARADKQHVLDML 

FSAFEKHQYYNLKDLVDITKQPWYLKEILKEIG 

VQ^KGIHKNTWELKPEYRHYQGEEKSD 


3815 


A 


17 


411 


NIGDWEDIGKSPERIIQYYGPATWAQDGSRGYCT 
PIYMLNrniRLQAVLEIIMNERANALDLI^QQTTK 
MRNANYQNRLALDYLLAHEGGV*GKFSLTNCC 
LEIDDNGKAIMEITARMRKLAHIPVQTWER 


3816 


A 


3 


1172 


SHWQRRDRRCVRNMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLSVALPGSIL 

DNAQSPELRTYLAGQIARACAIFCVDEIVVFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARBLQYLEC 

PQYLRKAFFPKHQDLQFAGLLNPLDSPHHMRQD 

EESEFREGVWDRPTRPGHGSFVNCGMKKEVKI 

DKNLEPGLRVTVRLNQQQHPDCKTYHGKWSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALVVFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRTEEAILISLAALQPGLIQAGARHT 


3817 


A 


246 


1197 


FLSAGMSNFTHYAYLLMBESLMLGKVPPHVPSH 

HFIFHDDGSARQKGESDYKVnQQWFSKSGPWTT 

SSNVTWGLLELQQSISBSAVLTIPPGDSGAGSNLI 

TMFLRNRKETDLCSGRSKVNRGWNSGRCKQRG 

KTEQPGEPLEHVYVTKHAVALESRHQKGELQC 

LIKMCIPLSKPLQMFFSPPHWEAWLQRVQQLAJC 

NTRYFRQRLQEMGFTIYGNENASWPLLLYMPG 

KVAAFARHMLEKKIGVVVVGFPATPLAEARARF 

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH 

KKSARPELYDETSFELBD 


3818 


A 


215 


789 


NPQSSSSEGSSEIFQVNGHNRLLVQRSEVTQAPG 

QYTVDVEGHGCTHQA1T-KYNVLLPKKASGFSLS 

l^rVKKY'SSTAFDLTVTLKYTGIRNKSSMVVIDV 

KMLSGrTTT^SSIEELEI^GQVMKTEVKNDHVL 

FYLENVFGRADSFTFSVEQSNLVFNIQPAPGMVY 

DYYEKEEYALAFYHINSSSVSE 



WO 01/57190 



PCT/US01/04098 



SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)
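The legend above can be read as a small lookup table. The following is a minimal sketch, not part of the original specification, of how one entry of the table might be tallied against that legend. The function name `summarize` and the category names in its result are hypothetical, chosen only for illustration. Characters outside the legend (for example digits or lowercase letters left behind by text extraction) are counted separately rather than guessed at.

```python
# One-letter residue codes as listed in the table legend (X is handled separately).
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue markers from the legend, mapped to hypothetical counter names.
MARKERS = {
    "*": "stop",        # stop codon
    "/": "deletion",    # possible nucleotide deletion
    "\\": "insertion",  # possible nucleotide insertion
}


def summarize(sequence: str) -> dict:
    """Count residues and legend markers in one amino acid sequence entry."""
    counts = {"residues": 0, "unknown": 0, "stop": 0,
              "deletion": 0, "insertion": 0, "unrecognized": 0}
    for ch in sequence:
        if ch.isspace():
            continue  # line wrapping inside a table cell is not sequence data
        if ch == "X":
            counts["unknown"] += 1
        elif ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1  # not in the legend; left uninterpreted
    return counts
```

Calling `summarize` on the concatenated lines of a single entry gives a quick check of how many characters fall outside the legend, which may help flag entries that need to be compared against the original publication.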


3819 


A 


1 


1483 


MPDSUSRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDITKVA 

RRHRMSPFPLTSMDKAFITVLEMTPVLGTOIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFIIGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAVVQVEDTEURESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPVVAVMSTGNELLNPED 

DLLPGKIRDSNRSTLLATIQEHGYPT1NLGIVGDN

PDDLLNALNEGISRADVnTSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGLF11YATLDIDGVR 

KlffALPGNPVSAWTCNLFVVPALRKMQGILDP 

RPTIIKARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEWDVMVIGRL 


3820 


A 



2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSA1CSCFPGYAJMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD

FLECQNSPARITIT^QLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKLWRQGSVn'FLAKMHl 

FF1TFAL 


3821 


A 


2216 


487 



PQEPALKSEFSQVASNTEPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGY1 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDG1TCCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRJDAFGRGCIDVNECWAS

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHEFRIGPAP 

AFTGDTIALNnKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FK1TFAL 


3822 


A 



2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKAFVRDPAPTECPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP
SHIERYKKDLKSWVQGNLTACGRSLFLFDEMDK 

MPPGLMEVLRPFLGSSWVWGT>ryTlKAlFIHSN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVLDNPHHGFSNSGIMEERLLDAVVPFLPLQRHH 

VRHCVLNELAQLGLEPRDEWQAVLDSTTFFPE 

DEQLFSSNGCKTVASRIAFFL 


3823 



A 


1 


3174 


YGOEKTTBGRIPLKNIYRLFSADRKRVETALEAC 

SI^SSRNDSBPQEDFTPEVYRVFLNNLCPRPEIDNI 

FSEFGAKSKPYLTVDQMMDFIhTLKQRDPRLNEIL 

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM 

RYLSGEENGVVSPEKLDLNEDMSQPLSHYFINSS 

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPV1THGFTMTTEISFKEVIEAIAEC 

AFKTSPFPILLSFENHVDSPKQQAKMAEYCRLIFG 

DALLMEPLEKYPLESGVPLPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLVNYIQPVKFESFEISKKRNKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQINMGMYEYNGKSGYR1JCPEFMRRPDKHFDP 

FTEGIVDGIVANTLSVKIISGQFLSDKKVGTYVEV 

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI 

VFKJCVVLPTLACLRIAVYEEGGKFIGHRILPVQAI 

RPGYHYICLRNEKNQPLTLPAVFVYIEVKDYVPD 

TYADVIEAI^NPIRYVNLMEQRAKQLAALTLEDE 

EEVKKEADPGETPSEAreEARTTPAENGVNHTTT 

LTPKPPSQALHSQPAPGSVKAPAKTEDLIQSVLTE 

VEAQTIEELKQQKSFVKLQKKHYKEMKJDLVKR 

HHKKTTOLIKEHTTKYNEIQNDYLRJIRAALEKS 

AKKDSKKKSEPSSPDHGSSTEEQDLAALDAEMTQ 

KJLrlDLKDKQQQQLLNLRQEQYYSEKYQKREHTK 

LLIQKLTDVAEECQNNQLKKLKEICEKEKKELKK 

KMDKKRQEKITEAKSKDKSQMEEEKTEMIRSYI 

QEWQYIKRLEEAQSKRQEKLVEKHKBIRQQILD 

EKPKLQVELEQEYQDKFKRLPLEILEFVQEAMKG 

KISEDSNHGSAPLSLSSDPGKVNHKTPSSEELGGD 

CPGKEFDTPL 


3824 


A 


1 


426 


ILHWFVHRWSGRNNREKIGVHVGFEEILNMEPY 

CCRETLKSLRPECFIYDLSAVVMHHGKGFGSGH 

YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA 

QAYILFYTQRVTENGHSKLLPPELLLGSQHPNED 

ADTSSNEILS 


3825 


A 


3 


364 


GIRAKFPNKlPVVVERYPRETFLPPLDK'nCFLVPQ 
ELTMTQFLSnRSRM\HLiCATEAFYLLVNNKSLVS 
MSATN1AEIYRDYKDEDGFVYMTYASQETFGCLE
SAAPRDGSSLEDRPLHPL


3826 


A 


1 


1237 


PEKKFERECREAEKAQQSYERLDNDTNATKADV 

EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ 

HKHFYVVIPQryXQLQEMDERRTIKLSECYRGFA 

DSERKVIPIISKCLEGMILAAKSVDERRDSQMVV 

DSFKSGFEPPGDFPFEDYSQHIYRTISDGTISASKQ 

ESGKMDAKTTVGKAKGKLWLFGKKPKGPALED 

FSHLPPEQRRKKLQQRIDELNRELQKESDQKDAL 

>HCMKDVYEK14PQMGDPGSLQPKLAETMNNIDR 



LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDINH

LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 

roDEFEDDDPLPAIGHCKAlYPTOGHNEGTLAMK 

EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 

DVTLEKNSKGS 


3827 


A 


2 


1584 


INPVSSAVNGEAHSSHETRGQNSNALPSVLLELL 

SQSCLIPAMSSYLRNDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLLAKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAEIVYAATTSLRQA 

NOEKKLGEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQroTFEMVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDIMKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CI^ILNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGN1RQ 

ATVKWAMLEQIRNPSPCFKEVIHKHFYLKRVEIM 

AQCEEW1ADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVKPSSSKELPSDFQL


3828 


A 


1415 


845 


PRVPATLVSLDPWHCFPTAGRLAGSTWVPPACT 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 

SEFGIIMSEFPLDPQLSKSELASCEFDCVDEVLTIA 

AMVTGILNDYSFSFFANLH 


3829 


A 


199 


683 



VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 

LMVHVEEATELKACKPNGKSNPYCEISMGSQSYT 

TRTIQDTLNPKWNFNCQFFIKDLYQDVLCLTLFD 

RDQFSPDDFLGRTEIPVAKIRTEQESKGPMTRRLL 

LHEVPTGEVWVRFDLQLFEQKTLL 


3830 


A 


1747 


404 


RKMMEESGJETTPPGTPPPNPAGLAATAMSSTPV 

PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPP1SGFSVGSTY 

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFKGVAGNPMVKSVLDKTKHSVESMIT 

TLDPGMAPYDCSGGELDIVVTSNKEVKVAAVRD 

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TDWHMAFTGMSRRQMIYSAARAIAGMYKQRLP 

PRTV 


3831 


A 


5


674 


FWTRSAWHEGLQQMKANDPSLQEVNLYNDCNIP 

IPTLREFAKALETNTHVKKFSLAATRSNDPVA1AF 

ADMLKVKTTLTSLNIESHFITGTGILALVEALKEN 

DTLTEKIDNQRQQLGTAVEMEIAQMLEENSRIL 

KFGYQFTKQGPRTRVAAAITKNNDLAWQKDTQ 

EQTSIWQWSQSIAGFNPQFEVQGQNARSWMEE 

LGKAFHQFVRRELKQTEGKLP 


3832 


A 


164 


782 


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDG AG VRA SDLEDEES AARGPSQEE
EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 
CEEEGDEGEEDRTSDLRDEASSVTRELDEHF.T X)Y 
DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 
REEGKAGVQSVGEKESLEAAKEKKKEDDDGEDO 
DEEMY 


3833 


A 


122 


1676 


SQPPHFTQKMNENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASHEMACEKEENINQDLKENETV 

MEHTKRHSDPDKSLQDEVSPRKNDIISVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRY1M 

EKIVQANKLLQNQEPV>nDKKERKLKrTCDQLVDL 

EWPLEDTTTSKNYFENERNMFGKLSQLCISNDF 

GQEDVLLSLTNGSCEENKDRTILVERDGKFELLN 

LQD1ASQGFLPPINNANSTENDPQQLLPRSSNSSV 

SGTKKEDSTAKIHAVTHSSTGEPLAYIAQPPLNR 

KTCPSSAVNSDRSKGNGKSNHRTQSAfflSPVTST 

YCLSPRQKELQKQLEEKREKLKREEERRKIEEEK 

EKKRENDIVFKAWLQKKREQVLEMRRIQRAKEI 

EDMNSRQENRDPQQAFRLWLKKKHEEQMKERQ 

TEELRKQEECLFFLKGTEGRERAFKQWLRRKRM 

EKMAEQQAVRERTRQLRLEAKRSKQLQHHLYM 

SEAKPFRFTDHYN 


3834 


A 


575 


774 


RSRTEELSNSGILKAMSKDLVTFGDVAVNFSQEE 
WEWLNPAQRNLYRKVMLENYRSLVSLGKDMSP 


3835 


A 


2 


100 


ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 


3836 


A 


91 


749 


RPTPGHGDFWMQPLTKDAGMSLSSVTLASALQV 

RGEALSBEEIWSLLFLAAEQLLEDLRNDSSDYW 

CPWSALLSAAGSLSFQGRVSHIEAAPFKAPELLQ 

GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP 

HQPLQLCEPLHSILLTMCEDQPHRRCTLQSVLEA 

CRVHEKEVSVYPAPAGLHIRRLVGLVLGTISEVS 

REPCFSSSSCWSCVAIKI 


3837 



A 


3 



1214 


SLGCTOSARGKGQDDEVRTLMANGAPFTTDWFS 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATERHHRDWELLIKYGADVHAFSKFDKSAFD 

LALEKKNAEILVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEVVNLASLISSTNTKTTSGDPH 

ASTVQFSNSTTSVLATLAALAEASVPLSNSHRAT 

AOTEEnEGNSVDSSIQQVMGSGGQRVITIVTDGV 

PLGNIQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVIKEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEAIARQQPNGVDFIMVEEVAEVDAW 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 


3838 


A 


1 


1332 



NflEDNKENKDHSLERGRASLIFSLKNEVGGLIKA 

LKIFQEKHVNLLHIESRKSKRRNSEFEIFVDCDIN 

REQLNDn ? HLLKSHT>na.SVNLPDNFTLKEDGME 

TVPWFPKKISDLDHCANRVLMYGSELDADHPGF 

KDNVYRKJU^YFADLAM>TVTCHGDPIPKVEFTEE 

EIKTWGTVFQELNKLYPTHACREYLKNLPLLSKY 

CGYREDNIPQLEDVSNFLKERTGFSIRPVAGYLSP 

RDFLSGLAFRVFHCTQYVRHSSDPFYTPEPDTCH 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFKJTVEFGLCKQDGQLRVFGAGLLSSISE 

LKHALSGHAKVKPFDPKITCKQECLITTFQDVYF 

VSESFEDAKEKMREFTKTDCRPFGVKYNPYTRSI
QILKDTKSITSAMNELQHDLDVVSDALAKVSRKP 
SI 


3839 


A 


3093 


520 



MVOTTVDQIRAIMDKKANIRKMSVIAHVDHGKS 

TLTDSLVCKAGIIASARAGETRFTDTRKDEQERCI 

TIKSTAISLFYELSENDLNFIKQSKI^GAGFLINLID 

SPGHVDFSSEVTAALRVTDGALVWDCVSGVCV 

QTET\^RQAIAERIKPVXJvIMNKMDRALLELQLE 

PEELYQTFQRIVEKVNVnSTYGEGESGPMGNn^I 

DP\1.GTVGFGSGLHGWAFTLKQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKXJLWGDRYFDP 

ANGKFSKSATSPEGKKLPRTFCQLILDPDFKVFDA 

IMOTKKEETAKLEEKLDIKLDSEDKDKEGKPLLK 

AVMRRWLPAGDALLQMITIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGIKSCDPKGPLMMY1SKMVP 

TSDKGRFYAFGRVFSGLVSTGLKVRIMGPNYTPG 

KKEDLYLKPIQRTILMMGRYVEPIEDVPCGNIVG 

LVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPV 

VRVAVEAKOTADLPKLVEGLKRLAKSDPMVQCI 

IEESGEHI1AGAGELHLE1CLKDLEEDHACIPIKKS 

DPVVSYRETVSEESNVLCLSKSPNKHNRLYMKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNELTDrnCGVQYL 

NEIKDSWAGFQWATKEGALCEENMRGVRFDV 

HDVTLHADAIHRGGGQIIPTARRCLYASVLTAQP 

RLMEPIYLVE1QCPEQVVGGIYGVLNRKRGHVFE 

ESQVAGTPMFWKAYLPVNESFGFTADLRSNTG 

GQAFPQCVFDHWQILPGDPFDNSSRPSQWAETR 

KRKGLKEGIPALDNFLDKL 


3840 


A 


2 


753 


SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 

SSAMASKILLNVQEEVTCPICLELLTBPLSLDCGH 

SLCRACITVSNKEAVTSMGGKSSCPVCGISYSFE 

HLQANQHLANIVERLKEVKLSPDNGKKRDLCDH 

HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 

TEEVFKECQEKLQAVLKRLKKEEEEAEKLEADIR 

EEKTSWKYQVQTERQRIQTEFDQLRSILNNEEQR 

ELQRLEEEEKKT 


3841 


A 


2 


405 


GKAFSCFTYI^QHRRIHMAEKPYBCKTCKKAFS 

HFGNLK.VHEREHTGEKPYECKECRKAFSWLTCL 

LRHERIHTGKKSYECQQCGKAFTRSRFLRGHEKT 

HTGEKMHECKECGKALSSLSSLHRHKRTHWRDT 

L 


3842


A 


311 


88 


AVLKJNMAPMTALGLLD1JHILNLILFLSAGEDFTS 
WSEIMMYILLVFLTLWLLIEMIYCYRKVSKAEE 
AAQENA 


3843 


A 


3 


1175 


APIRNSRIDDFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKXITESVAETAQTIKKSVEEGKIDGIID 

KTnGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 

NDEETIQQQILALSADKRNFLRDPPAGVQFNFDF 

DQMYPVALVMLQEDEU^KMRFALVPKLVKEE 

VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPWIKSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 

VLDKKQEETAVLEEDSAJDWEKELQQELQEYEV

VTESEKRDENWDKEIEKMLQEEN



3844 



798 



148 



LPPAQIPEAWLLLANWVVLILVPLKDRLIDPLLL 

RCKLLPSALQKMALGMFFGFTSVTVAGVLEMER 

LHYIHHNETVSQQIGEVLYNAAPLSIWWQIPQYL 

LIGISEIFASIPGIJBFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 



3845 



1934 



PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLKPHIV 

12OT)1^PTILDIAGLDIPADMDGKSIIJCIXDTERP 

VNRFHLKKKMRVWRDSFLVERGK1XHKJRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKLHIDHEIETLQNKIKNLREVRGHLKKK 

RPEECDCHKJSYHTQHKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG



3846 



1934 



PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 
MRYTGPMKPIHMErTT^MLQRKRLQTLMSVDDS 
METIYNMLVETGELDNTYIVYTADHGYHIGQFG
LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPfflV 
LNIDLAPTTLDIAGLDIPADMDGKSILKLLDTERP 
VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK 
VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 
QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 
LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 
KKYKASYVRSRSIRSVAEBVDGRVYHVGLGDAA 
QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 
DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 
AWKDHKLHIDHEIETLQNKIKNLREVRGHLKKK 
RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 
QEKDKVWLLREQKRKKKJLRKLLKRLQNNDTCS
MPGLTCTTHDNQHWQTAPFWTLGPFCACTSAN 
NNTYWCMRTO^THNFLFCEFATGFLEYFDLNT 
DPYQU^AVNTXDRDVLNQLHVQLMELRSCKG 
YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 
PEMKRPSSKSLGQLWEGWEG 



3847 



1257 



MVFSAVLTAFHTGTSNTTFVVYENTYMNITLPPP 

FQHPDLSPLLRYSFETMAPTGLSSLTVNSTAVPTT 

PAAFK5LNLPIX3ITLSAIMIFILFVSFLGNLVVCLM 

VYQKAAMRSAJNILLASLAFADMLLAYLNMPFA 

LVTILTTRWIFGKFFCRVSAMFFWLFVIEGVAILL 

HSIDRFLIIVQRQDKLNPYRAKVLIAVSWATSFCV 

AFPLAVGNPDLQIPSRAPQCVFQYTrNPGYQAYV

ILISLISFFIPFLVILYSFMGILNTLRHNALRIHSYPE 

GICLSQASFCLGLMGLQRPFQMSIDMGFKTRAFTT 

ILILFAVFIVCWAPFTTYSLVATFSKOTYYQHNFF 

EISTWLLWLCYLKSALNPLIYYWRIKKFHDACLD 

MMPKSFKFLPQIJ>GHTKRRIRPSAVYVCGEHRT 

W 


3848 


A 


3 


2827 


SSAVAARKKRSWASLVLAFLGVCLGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYKIILTARPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKJDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIFWLNAAETWVDISSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFT 

WDPSRFPQPRTMLERLASKRJRJCLVAIVDPHIKVD 

SGYRVHEELRNLGLYVKTRDGSDYEGWCWPGS

AGYPDFTKfTMRAWWANMFSYD>ryEGSAPNLF 

VWNDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DYHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDURDAL 

GQRYSLLPFWYTLLYQAHREG1PVMRPLWVQYP 

QDV1-1FNIDDQYLLGDALLVHPVSDSGAHGVQV 

YLPGQGEVWYDIQSYQKHHGPQTLYLPVTLSSIP 

VFQRGGT1VPRWMRVRRSSECMKDDPITLFVALS

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWIERVVnGAGKPAAW 

LQTKGSPESRLSFQHDPETSVLVLRKPGINVASD 

WSIHLR 


3849 


A 


1 


1717 



RARNARGCWGVCRSGFSSAVCGAARMEQVAEG

ARVTAVPVSAADSTEELAEVEEGVGWGEDNDA 

AARGAEAFGDSEEDGEDVFEVEKJLDMKTEGGK 

VLYKVRWKGYTSDDDTWEPE1HLEDCKEVLLEF 

RKKIAENKAKAVRKDIQRLSLNNDIFEANSDSDQ 

QSETKEDTSPKKKXKJCLRQREEKSPDDLKKKKA 

KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 

EELKESKKPKKDEVKETKELKKVKKGEIRDLKT 

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKETKRNESKKPKKDEVKETECELKKVKKGEIRD 

LKTKTREDPKENRKTKKEKFVESQVESESSVLND 

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK 

AEDTRENRK1JENKNAFLEKKTVPKKQRNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS

AEEDKETKRNESKKPKKDEVKJETK^UCXVKKGE
IRDLKTKTREDPKENRKTKKEKFVESQVESESSV 
LNDSPFPBD/RQ*RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 
EELKESKKPK 


3850 


A 



1113 



3975 



PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKVA01LCQNPKLALKNSPPYIL 

DLLPDTYQHLRTILSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKEGKERMYEENSQPRKNL 

TKl^LIFSHMLAELKGIFPSGLFQGDTFRITKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALK5TTDLTCNDYISVFEFDll< tt lKLFQPWSSLL 

RInTVWSLAVTHPGYMAFLTYDEVKARLQKFIH^ 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQAT JDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIWDPFDPRGSGSLLRQGAEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPK1 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

MYNIQSQAPSITESSTFGEGNLAAAHANTGPEES 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG

LFVLERDP*PQNVTEGSQVPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATA\SPQLSSEIENLMSQG 

YSYQDIQKALVIAQNN1EMAKNILREFVSISSPAH 

VAT 


3851 


A 


2 



2781 


GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFIRSLKVNC 

KVTSRFAHYVVTSQWNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVIJKRNHMQYEIVIKVKI'KQLVHHFEIDV 

DIFEPQGISKLDAQASFLPKELAAQT1KKSFSGKK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDKICDIXVAlWHFAHrTAPQNLTNM>^ 

roiSGSMRGQKVKQTKEALLKILGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATh^NGGIJLRGIEILNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEX^SMENNGRAQRIYEDrTO 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIWAGRIADNKQSSFKADVQA 

HGEGQEFSITCLVDEEEMKKLLRERGHNILENHV 

ERLWAYLT1QELJLAKRMKVDREVRANLSSQALR 

MSLDYGFVTPLTSMSIRGMADQDGLKPTIDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RWGVDTDPHHIHVPQKEDTLCFNINEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR 



LGIANPATDFQLEVTPQNITLNPGFGGPVFSWRD 
QAVLRQDGVVAmNKKRNLVVSVDDGGTF\EVV\ 
LHRVW\KGSS\VHQDFLGLLMCWDKSIGMSSPGR 
KGCWGQVFFHPIRFLKVS*HPPPGSDPQKAQMPT
MWRNPP GLTVT\RGLQKD YSKDP WHGAEVSC 1 
WFI\HNNGA*ATDCAYTDY1\VPDIF 


3852 


A 


39 



1735 



TQVAEAGRGEGWAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCTSMALTCTFNRWGTLLAV 

GCNDGRJVIW\DFVLTRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTDNIVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQNKVLVCPMKSAPVMLTLSD 

SKHWLPVDDDSDLNVVASFDRRGEYIYTGNAK

GKILVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKGSCFLINTADRHRVYDGREILTCGRDGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYIVAGSARQH 

ALYIWEKSIGNLVKILHGTRGELLLDVAWHPVRP 

UASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKJCKPKTTNIELQGVFNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


3853 


A 


45 


2603 


PLLFTCGREVRARDPEKEGTIWAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLLSEKNVCBaYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKJHAREKSYECKECRKAFRQQSYLI 

QHLRJHTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLHYHLTEHQRIHSGVK 

PYECKECGKAFSRVRDLRVHQTfflAGERPYECK 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKT 

FRVQRHISQHQKJHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRIHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICGYQLTLHLRTHTGEIPYEC 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA

FRLQGELTRHH3UHTCEKPYECKECGKAFIHSNQ 

FISHQRJHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHR1HTGEKPY 

KCTECGKAFIRSTHLTQHHRIHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHHIVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEKPYQC

KECGKAFniSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLENAL*QRICNLRNFLFVTEHVGIPFTSCSQFI

RNYFVC 


3854 


A 


108 


894 


LQSCWVPGBPWPSVGWLSWLKDLPSCEIHSASLS 

AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDT1 

DNLSTDDINTSSSISSYANTPASSRKNLDVQTDAE

KHSQVERNSLWSGDDVKKSDGGSDSGDCMEPGS 

KWRRWSDVSDESDKSTSGKKNPVISQTGSWRR 

GMTAQVGnMPRTKASAPAGALKTPGTGKRPGL

S\GPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 

STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 

RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 

LSAASTPAVSPSTSPSAVEGSPATTLPLASSPPPRT 

TP 


3855 


A 


1 


772 


FRGGDGAPGVLKPGNPLPFPLPPLQYPPPSTLSHS

DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVYDITNQETFARAKTWVKELQRQASPVSIVVGL 

AGNKADLANKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\A1A*EVAKRVNPQNLG\G\A

AGRSRGVDLHEQS\QQNKSQCCSN 


3856 


A 


2815 



352 


LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE 

LRKKVTGILWNLSSSDHLKDRLAKKTPLEVQLTVD 

LGV*APLSGAGGPPALIQQNASEAEIFYNATGFPR

NLSSASQATRQKMRECHGLVDALVTSINHALDA 

GKCEDKSVENAVCVLRNLSYRLYDEMPPSALQR 

LEGRGRRDLAGAPPGEVVGCFTPQSRRLRELPLA 

ADALTFAEVSKDPKGLEWLWSPQIVGLYNRLLQ 

RCELKRHTTEAAAGALQNITGGXDPRGPGGLSRL 

ALEQERJDLNPLLDRVRTADHHQLRSLTGLIRNLS 

RNARNKDEMSTKVV\SHLI\EKLPGSVGEKSPPAE 

VLV\NI\IAVFNNLGWLASPVALARDLLYFDGLRK 

LIFIKKKRDSPDSEKSSRAASSLLAIsnLWQYNKLH 

RDFRAKGYRKEDFLGP 


3857 


A 


1034 


204 


VAVTIXSQLPSArQRTAAWEMRAPLTFRVPLALD 

LIKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

QQKJCKTKDLGFRAGKESKTEWRK*GLQDMASQ 

MFALPLK*PVTAAFHDSSMPSSLLQIEMEQLFLE 

ARLQ/PDSKSEARRNQCDSMLLRNQQLCSTCQE 

MKMVQPRTMKIPDDPKASFENCMSYRMSLHQP 

KFQTTPBPFHDD1K1ENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


3858 


A 


203 



3469 


SHQEffiQNSAMAPRKRGGRGISFIFCCFRNNDHPE 

ITYRLRITOSNFAIXJTMEPAIJMPPVEELDVMFSE 

LVDELDLTDKHREAMFALPAEKXWQIYCSKKK 

DQEENKGATSWPEFYIDQLNSMAARKSLLALEK 

EEEEERSKT1ESLKTALRTKPMRFVTRFIDLDGLS 

CILNFLKTMDYETSESR1HTSLIGC1KALMNNSQG 

RAHVLAHSESINVIAQSLSTENIKTKVAVLEILGA

VCLWGGHKKVLQAMLHYQPCYASERTRFQTLIN

DLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVE

SLDFRLHLRYENFLMLGIHPVMDKLRKHENSTLD 

RHLDFFEMLRNEDELEFAKRPELVHIDTKSATQM 

FELTRKRLTHSEAYPHFMSDLHHCLQMPYKRSGN 

TVQYWLLLDRIIQQIVIQNDKGQDPDSTPLENFNI 

KNVVIcJS4LVNEhm\^QWKEQ 

QKLEiCKERECDAKTQEKEEMMQTLNKMKEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKJSFNWSKLPENKLEGTVWTEIDD 

TKVFKILDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSVIDGRRAQNCNILLS 

RLKLSNDEKRAILTMDEQEDLPKDMLEQLLKFV 

PEKSDIDLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEAIRSGSEE 

\O^GALKQLLEVVIJU ? GNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLnWENKYPSV 

LNLNEELRDIPQAAKVNMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSWSQF1TVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGK1QPDEFF 

GIFDQFLQAVSEAJCQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

LRSGBWDKDLSKLKKNRKRITNQMTDSSRERPI

TKLNF 


3859 


A 


1279 


141 


RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS 

LEFLVVSEETYRGGMAINRFRJLENDLEELALYQI 

QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 

ERPELPTCLYV1GLTGISGSGKSSIAQRLKGLGAF 

VIDSDHLGHRAYAPGGPAYQPWEAFGTDILHK 

DGII>«UCVLGSRWGNKKQLKILTDIMWPIIAKLA 

REEMDRAVAEGKRVCVIDAAVLLEAGWQNLVH 

EVWTAVIPETEAVRRIVERDGLSEAAAQSRLQSQ 

MSGQQLVEQSHWLSTVCGSRISPNARWRKPGPS 

CRSAFPRLIRPSTEKFSVGPDWLLELTSDPWRRN 

GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 

VFGQH 


3860 


A 


1 


3881 



MGQKSVGASYVQIPLVPPLSRHPKGLGHBDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVO^PNKAIFTVDA 

KTTEDLVANDKACGLLGYSSQDUGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAVVFGTWDnSRS 

GEKIPVSVWMKRMRQERRLCCVWLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCraGLITLLPDGTIHGINHSFALTLFGYGKTELL

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

GGHVVPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQ1T 

ALGREEPVAEBSPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 



EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVWKF1KKEKVLEDCW1EDPKLG 

KVTLELAJLSRVEHANIDCVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAGXQ 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3861 


A 


1 


3881 



MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSIXRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAVVFGTWDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLICIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKN1TFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

GGHWPRDEIRKLMESQDIFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDKEELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW

TAVDKEKNKEVVVKFIKKEKVLEDCWIEDPKLG 

KVTLEIAII^RVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLA S YIFRQ VRAG\Q 

SRLVSAVGYLRLKDIIHRDIKI^ENIVIAEDFTriKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 
IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT
DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 
LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL
HPGDPRLLTS 


3862 



A 


399 


2069 


TMDRSKKNSIAGFPPRVENRLEEFEGGGGGEGNV

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG

rTSEVrTCVRHRASGQVMALKMNTLSSNRANML 

KJBVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGDFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKJLA 

VVGSPFWMAPEVLRDEPYNBKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTTOCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK 

SVISLVFDLDAPGPOTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3863 


A 


399 


2069 


TMDRSKRNSIAGFPPRVENRLEEFEGGGGGEGNV 

SQVGRVWPSSYRAUSAFFRLTOLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSK

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3864 


A 


3 


911 


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ

RAKPSNFLLDRKKTDKLICKKKKRKRRDSDAPGK 

EGYRGGLLKJLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMAKNDGNFSSLLESIFPSVDDDSWDLV

TCTCMKPFAGRPMHBCNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 


3865


A 


3 


3573 


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 

FLT1ARRRGRRSMPVSLEDSGEPTSCPATDAETAS 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 

DCDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 

RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYICPNCTBLQVQDETHSETADQQEAKWRPGDA 

DGTDCTSIGTIEQKSSEDQGIKGRIEKAANPSGKK 

KJLKIFQPGPG P VPTQLPVL WQ VLEIAV SRSISAFT 

LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 

M3CILKHAAATMKFLSSGKEQKPKPKEKMKMK 

PEKPSLPKCGAQAGIKISSVHK31PAPEKKETTVK 

KAWVPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PAIKKPPSGFKGTIPKRPWLSATPSSGASAARQAG 

PAPAAATAASKKFPGSAALVGAVRKPWPSVPM

ASPAPGRLGAMSAAPSQPNSQIRQNIKRSLKEIL 

WK/RFIJTILFRVKDSDDLIMTENEVGKIALHIEK 

EMFNLFQVTDN/RAYKSKYRSIMFNLKDPKNQG 

LFHRVLREEISLAKLVRLKPEELVSKELSTWKER 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SOHRAHLFDLNCKICTGOVPSAEDEPAPKKQKLS 

ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARQDVPKPVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRLSTTWKGFTNMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTJMGGRIAPKTVWDYVGKLKSSVSK 

ELCLIRFHPATEEEEVAYISLYSYFSSRGRFGVVA 

NNNRHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 

LSGWR 


3866 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRJLAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV

EALQEHPRDPNQILIGYSRGLVVrWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

GNLPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE

RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH

DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 




LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 
LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 
ASCVFTKYGQGFYLISPSEFERFSLSTKGVLVEPRC 
LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 
GEEKQPGLVMERALLSDERAATGWHEEPPWGA 
ASAMAEQSEWLSVQAAR 



3867 



3181 



3868 



1 



AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGADC 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITWLPHSSCELLYLGTESGNVFWQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLWIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKArTRELWLTTRQ 

GNLPFTBFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 

LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQVVSLPLLKPQVRYSCERREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGVLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 



2497 



GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GVYARVTRLRDWILEATTKASMPLAPTMAPAPA

APSTAWPTSPESPWSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRWGGFGAASGEVPW

QVSLKEGSRHFCGATWGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVKIGLRRWLHP 

LYNPGIIJDFDIAV1JBLASPUVFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV

GIIDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KXRAELDEVNKSAJCKREGELTVAQGRVKDLESL 

FHRSEVELAAALSDKRGLESDVAELRAQLAKAE 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRjLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AJKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAMAGERDKFRKMLD 

AKEQEMTEMRDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

HGHGWQRWLPPGPAGLGLGQR\HIEE1DLEGKFV 

QLKNNSDKDQSLGNWRJDCRQVLEGEEIAYKFTP 

KYILRAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A 


1 


1942


RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRDCQLLHQLPPHDS 

EAQYCTALVEE\EEKKELRAFSQQRKRENLG/RLG 

IVRIFPVTIT\GAI\CEECGKQ1GGGDIAVF\ASRASL 

GLLLGQPSCnVCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

GLRSVPEPPPESPGQPNLRPDDSAFGRQSTPRVSF 

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA 


3870


A



2 


3485 


FVWRVFYVHASCMPPRARSWEGAHAPVGMHV

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTWGSPKAPFLGIPRGTMRRQKSIDSRI 

FLSGrTEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSWPSPPPPSPTTYNCPKSPTPRVYGTCKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKJttjQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTIIVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLIDIMDTSQQKSAGLLMVHTVDATKLDNA 

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI

SAAPEPTTVPGRTIVAVGSMEEAVILPFRIPPPPLA

SVDLDEDFIFTEPLPPPLEFANSFDIPDDRAASVPA 

LSDLVKQKKSDTPQSPSLNSSQPTNSADSKKPAS 

LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGENVDTCTVYADGQ

AFMVDKPPVPPKPKMKPIMKSNALYQDALVEE 

DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEIKSPILSGPKANVISELNSILQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMSTISGTRST 

TV'l'FL'VRPGTSQPITLQSRPPDYESRTSGTRRAPS 

PWSFI'EMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDLFGLNPAGRSRSPSPSILQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAJFMDNEI 

DGSHLPNLQKEDLIDLGVTRVGHRMNIERALKQ

LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPWDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEHENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT

TPATPDNASQEELMITLVTGLASVTSRTSMGIIIV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS

NEES 


3872 



A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPWDVLKIYKSELNKHIEDGMGRNLADRCTD

EVNALVLQTQQEHENLKPLLPAGIQDKLHTLIPC

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPAIPDNASQEELMITLVTGLASVTSRTSMGIHV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMTVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS

NEES 


3873 


A 


2944 


2089 



PVCTALTPGRMTDDKDVLRDVWFGRIPTCFTLY 

QDEITEREAEPYYLLLPRVSYLTLVTDKVKKHFQ 

KVMRQEDISErV^YEGTPLKWHYPIGLLFDLLA 

SSSAIJPWNTTVHFKSFPEKDLLHCPSKDAIEAHF 

MSCMKEADA1JKHKSQVINEMQKKBHKQLWMG 

LQNDRFDQFWAINRKLMEYPAEENGFRYIPFRIY 

QTTTERPFIQKLFRPVAADGQLHTLGDLLKEVCP 

SAIDPEDGEKKNQVMfflGIEPMLETPLQWLSEHL 

SYPDNFLHTSIIPQPTD 


3874 


A 


776 


366 


QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG 

LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 

DEARKIGWGWVKNTSKGTVTGQVQGPEDKVN 

SMKSWLSKVGSPSSRIDRTNFSNEKTISiCLEYSNF 

SIRY 


3875 


A 


1081 


182 


SLSSCQTDPRPMSAPLDAALHALQEEQARLKMR 

LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

QQDPEVPKSLVSNLRIHCPLLAGSALITFDDPKVA 

EQVLQQKErmNMEECRLRVQVQPLEl^MVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLEIF 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 

RGGGEVEALTVVPQGQQGLAVFTSESG 


3876 


A 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPLSL 
VAmEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLTSSGIEKKSGRLSS1LDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 



A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCIDPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVMRKAA

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISMNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHYI 

LNGSKAFISGAGESDIYVVMCRTGGPGPKGISCIV 

VEKGTPGl^FGKKEKKVGWNSQPTRAVIFEDCA 

VPVANRIGSEGQGFLIAVRGLNGGRINIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQELEGSNEVMRILISRSLLQE 


3878 


A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARKSRRGDSRPG 

SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR

TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS 

PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

TQGCSKLLGKQTTHLPCSTWPA**PSPSCLTRFR* 

W*PSLMCLWASSCSVCV*SPSGSCRH»LWGTHST 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSPRLTQWKSSRLTSTSHSARSAWKPSA 

TESTPSWPRFSSWTSGEDPASPAPAI 


3879 


A 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG 

NTSLC1RDYKITQVLFPLLYTVLFFVGUTNGLA 

MRlFFQXRSKSNFIIFLKhnVISDLLMELTFPFKILS 

DAKLGTGPLRTFVCQVTSVIFYFTMYISISFLGLIT 

IDRYQKTTRPFKTSNPKNIXGAKILK 


3880 


A 


26 


169 


QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG 
DDLCQILVDRPRLRI 


3881 


A 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC 

RIMPri'VDDVLEHGGEFHFFQKQMFFLLALLSAT 

FAPIYVGIVFLGFTPDHRCRSPGVAELSLRCGWSP 

AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST 

FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS 

IVTEFNLVCANSWMLDLFQSSVNVGFFIGSMSIG 

Y1ADRFGRKLCLLTTVLINAAAGVLMAISPTYTW 

MLIFRLIQGLVSKAGWLIGYILITEFVGRRYRRTV 

GIFYQVAYTVG1XVLAGVAYALPHWRWLQFTV 

ALPNFFFLLYYWCDPESPRWLISQNKNAEANOUIK 

HIAKKNGKSLPASL 


3882 


A 


573 


1620 



KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSYIGP 

KRTAWRGIMHREAFNIIGRRIVQVAQAMSLTED 

VLAAALADHLPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDVYWDIEGAVRRYVQPFLNALGAA 

GNFSVDSQILYYAMLGVNPRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRYE\^MVRVMEVFLAQLRLXFGI 

AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATITLTSLA 


3883 


A 


2369 


844 


RmREEDFQFILKGIARLl^NPLLQTYU>NSTKKIQ 
FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 
Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 
LJTVLKSSDVLDILVPILFFLNDARADQSRVGLM 

mGWILLIXSGECOTGVRLNKPYSIRVPMDIPVF 

TGTHADLLIVXVFHKUTSGHQRLQPLFDCLLTTVV 

NVSPYLKSLSMVTANKLLHLLEAFSTTWFLFSAA 

QNHHLVFFLLEVF^IIQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTDPPT1HKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

♦PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHP1LIRKYQANSGTAMWFRT 

YMWGVTrlJlNVDPPVWYDTDVKLFEIQRV 


3884 


A 


1 


804 


NGPRAPFSQEGQSTGPPPL1PRLGQHGAQGRIPPL 

NPGQGPGPNKDDSRGPPNrfflMGPMSERRHEQSG 

GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 

SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 

RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 

RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 

GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 

LRGRGRGTPRGERVTKDTWS GRIGCRIHWL 


3885 


A 


3 


996 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKN1AQENPKMHNSEISKKLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGSXMG 

SWKSEASSSPPWTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3886 


A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACV2SIKCTKG 
TKSQMNCmGTCQLQRSGPRCLCPKTOTHWYW 
GETCEFNIAKSLVYGrVGAVMAVLLLALnLIILFS 
LSQVRKRHRPESEGEADFGLENATNNFGVPTLETV 
DSGTELHIQ\RPEMVASTV 


3887 


A 


3 


466 


VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 

RQLLRKADGWLMYDITSQESFAHVRYWLDCL 

QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 

AQELGVYFGECSAALGHNILEPWNLARSLRMQ 

EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A 


3412 


3144 


QNID1TNFSSS WNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMLAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFK1' 


3889 


A 


1 


1160 


LVVTAITAILAFPNEYTRMSTSELISELFNDCGLL 

DSSKJLCDYENRFNTSKGGELPDRPAGVGVYSAM 

WQLALTLILKIVITIFTFGMKIPSGLFIPSMAVGA1 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLVVIMFEL 

TGGLEYl\0>LMAAAlvrrSKWVADAJLGREGIYDA 

HIIU.NGYPFLEAKEEFAHKTLAMDVMKPRRNDP 

IXTVLTQDSMTVEDVETIISETTYSGFPVVVSRES 

QRLVGFVLRJRDLnSIENARKKQDGWSTSIIYFTB 

HSPPLPPYTPPTLKIJlNnJM^P^ 

DIFRKLGLRQCLVTHNGRLLGnTKKDVLKHIAQ 

MANQDPDSILFN 


3890 


A 


1 


387 


SWCWTGIFVLGTTNLRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RVPYTKLQLKELENEYAINKFINKDKRRRISAAT 
NLSERQVTIWFQNRRVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIHPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPTVVRAAELEQVPHIALFLFK 

KTRLSmCFFSKFLLPYCGLDTLAJDQNXNQVRKT 

SQAALL\ALLEQEUERFDVETBCVCPVLIELTAPDS 

NDDVKTEAVAIMCKMAPXMVGKDITERLILPRFC 

EMCCDCRMFHWRKWCAANFGDICSWGQQAT 

EEMLLPRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVELENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPNITMATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLHYIH 

NDSDLSNNSSFSPDEERRTKVQDVVPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

QNWHCLRETY Kll^SDMQ WKVRRTL AFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRIGVLKH 

LHDFLKLLHIDKRREYLYQLQEFLVTDNSRNWR 

FRAELAEQLILLLELYSPRDVYDYLRPIALNLCAD 

KVSSVRWISYKLVSEN1VKKLHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASMPASTKISEDAMSTASSTY


3892 


A 


158 


2191 


VPLPAPSCiLSGGGSRGAGCKKAPPGRAPAPGLAP 

LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRLPLAFRDATSAPLRKLSVDLDCTYKHINEV 

YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD

DNHDYIVRSGERWLERYEIDSLIGKGSFGQWKA 

YDHQTQELVADCIIKNKKAFLNQAQIELRLLELM 

NQHDTEMKYYIVHLKRHFMFRNXHLCLVFELLS 

YNLYDLLRNTHrTlGVSLNLTRKI^QQLCTAIXF 

LATPEL S IIHCDLKPENILLCNPBCRS A IKIVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

SLGCILVEMHTGEPLFSGSNEVCPQEGVDQMNRI 

VE\HLGIPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTTUU-QEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 

GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 

QAPASASSLPGTGAQLPPQPRYLGRPPSPTSPPPP 

ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 

S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGTDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 

AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP I 

PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 

QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPLPGT 

HSGPPPAAVSIJPAAAACPVVVPPPLPHHPPDLES 

PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 

LLPLPRPPS*P/VPWKPLHSPVAVAGGSFVAGGSV 

LPAPDLDQPRPSGPPAASP1PGPGVAQPPPGSAVL 

PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYRPGNGSCWVKVTASSDLSDUSCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVEKELAPQQEGNP/ARSIPHSD1GT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF

LPHPHELQQFQAEGKIYECNHVEKSVNHGSSVSP 

PQnSSTIKTHVSNKYGTDFICSSLLTQEQKSCIRE 

KPYRYIECDKALNHGSHMTVRQVSHSGEKGYKC 

DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 

RSFSRNSCLALHRRVHTGEKPYKCYECDKVFSR 

NSCLALHQKTHIGEKPYTCKECGQAFSVRSTLTN

HQVIHSDK 


3896 


A 


202 


498 


MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLC 
KJSWEAAVRRKNF1CPTKYSSICSEHFTPDCFKREC 
NNKLLKENAVP'1'IFLCTEPHDKKEDLLEPQEQ 


3897 


A 


2 


382 


SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMMFILLFSRQGKLRLQKWYITLPDKER 
KKITREIVQIILSRGHRTSSFVDWKELKLVYKRYA 
SLYFCCAIEVNQDNELLTLENVHR 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWD1LETEEHYKSRWRSIRIL 
YLTMFLSSVGFSVVMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHEPASHNKYYMLVARGLLGIG 


3899 


A 


24 


718 


FRGRPGIPEREGKGNHSFVEVARVIWDLHSRLG

GAMAERKGTAKVDFLKKIEKEIQQKWDTERVFE 

VNASNLEKQTSKGKYFVTFPYPYNINGRLHLGHT 

FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 

ACADKLKREIELY/GCPPDFPDEEEEEEETSVKTE 

DHIKDKAKGKKSKAA/AKAGSSKYQWGIMKSLG 

I^DEEIVKFSEAEHWLDYFNALAIQDLKRMG 


3900 


A 


360 


1 


VPATSSNVSPSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901 


A 


193 


345 


GEWAVPPAPGGQGVSIPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP 


3902 


A 


1188 


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSHAANPALSPRAPHSHYRPRPRCGPRRRPR


3903 


A 


63 


396 


NNMRNPHLSSNHYLN1^\RTETWAR^ 
LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGETEE 
LTEDGKPL*VPERKAPLCDCTCFGLPRRYIIAIMS
GLGFCISFG 


3904 


A 


732 


1046 


AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 

VAPNALMFIRFM*NCTFVPKLP*VMDLK**LQYK
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 

MRVGAEYQARIPEFDPGATKYTDKJDNGGMLVW 

SPYHSIPDAKLDEYIAIAKEKHGYNVEQALGMLF 

WHKHhOEKSLADLPNFITFPDEWTVEDKVLFEQ 

AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 

TRSRTSLMDRQARKLANRHNQGDSDDDVEETHP 

MDGNDSDYDPKKEAKKEGMS 


3906 


A 


2 


513 


KVCNCCSQELETSFTYVDKNINLEQRNRSSPSAK 

GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 

EATSSGKSIDYGFISAILFLVTGILLVIISYTVPREV 

TVDPNTVAAREMERLEKESARLGAHLDRCV1AG 

LCLLTLGGVtLSCLLMMSMWKGELYRRNRFAS 


3907 


A 


71 


412 


ILIMSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRVVITGIGLVTPLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 


3908 


A 


77 


746 


LGTLLGWRAPLFSRCLAFHSPFDLLNTPKLVKTAE 

LPPDRNYVLGAHPHGIMCTGFLCNFSTESNGFSQ 

LFPGLRPWLAVLAGLFYLPVYRPYIMSFGLCPVS 

RQSLDFILSQPQLGQAVVIMVGGAHEALYSVPGE 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 


3909 


A 


1 


793 


FRAAGRPAAAMGDIPWGLSSWKASPGKVTEAV 

KEAEDAGYRHFDCAYFYHNEREVGAGIRCKIKE 

GAVRREDIXIATKjLWCTCHKKSLVETACRKSLK 

ALKLNYLDLYLKWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVTPSDTDFLDTWEAME 

DLXOTGLVKNIGVSNFNHEQLERLLNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDLIDNPVIKRIAKEHGKSPAQILI 


3910 


A 


202 


705 


FFTMHRKKVDNRIRILIENGVAERQRSLFVVVGD 

RGKDQVVILHHMLSKATVKARPSVLWCYKKEL

GFSSHRKKRMRQLQKKJKNGTLNIKQDDPFELFI 

AAThnRYCYYNETHKILGNTFGMCVLQDFEALTP 

NIXARTVETVEGGGLVVILLRTMNSLKQLYTVT 

M 


3911 


A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 
RYGADKMAAGGAVAAAPECRLLPYALHKWSSF
SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 
LERPATV QNnTGKYEKTHVC^IJKKJFKVFGGMN 
EENMTELLSSGLKNDYNKETFTLKHKIDEQMFPC 
RFIKIVPIXSWGPSFNFSIWYVELSGIDDPDIVQPC 
3912


3913


3914


362


1


461


20


7545



LNAVYSKYREQEAIRLCLKHFRQHNYTEAFESLQ 
KKT 



FEKKQLRRPSLFLLGCCSFGIMAPSLAVKGLEGIG 

LFALAHAAFSAAQHRSYMRLTEKEDESI^IDIVL 

QTL1AFAVTCYGIVHIAGEFKDMDATSELKNKTF 

DTVRNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 

LSSNTSLKLRKLESLRR 



APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKTVLLGDMN 
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 



PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKAXRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYHKTDEN 

VRKENNKJKERRLSAEKTXAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEWHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVPEVSKTQDNRNNNSHQDIDSEN

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAJPSENDRV 

QKNUCNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKJODGIAVDHWGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS

IEADEGLnGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKBGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDG SLSRDSErVEGTITFISEVESDGA VTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTOAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 
EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 
LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 
KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLnSTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTS1AEECEASVSGWVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AATLQDEDRLTITRVEDLSDAAnSTSTAECMPISA 

SIDRHEKNQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKJECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ

GTVAJSHSFLPAEQQG SEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLhTVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 


3915 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTP V1MEG VQEETDTRDVKRQ VERSE 

ICTEEPQKQKSTLKNEKHLKKJDD SETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKN1HEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMKQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVTVTENEhnTKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKJKDG IA VDHVVGLNTEK Y AETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQKNGQTEDVATGPRRAEKTSVATSTE 

GKI>KX>VTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAWTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGABGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVVVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAUSTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPB 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQG SEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDySGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGKEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHS\^ADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE

KPEQKDDDTIKSQE 


3916 


A 


2 


773 


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FTRSQLDVLEALFAKTRYPDIFMREEVALKINLPE 
SRVQVVVTKNRRAKCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFIPPAVSSSASSSSSASSSSA 
NPAAAAAAGLVVAKLPCPLHIFSLCVF1EENRLV 
SGSWARDIRSVBETDKSGYR 


3917 


A 


2 


776 


RNIPGRRFRPPGIJRRLIXGPHMPREPRGYRTRVP 

ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

NVQAGGAIJ^PRHLCGLCSRLHFLKPDLSVRAA 

PSRAGASVMALRKELLKSIWYAFTALDVEKSGK 

VSKSQLRVLSHNLYTVIJilPHDPVALEEHFRDDD 

DGPVSSQGYMPYLNKY1LDKVEEGAFVKEHFDE 

LCWTLTAKKNYRADSNGNSMLSNQDAFRLWCL 

FNFLSEDKYPLIMDPDEGEYLLKRYS 


3918 


A 


10 


318 


WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 
CTGLWQAQRQASRQSQRELGGQVDLFKRRW 
RRLASLKTRRCRLSRAAQGLPDPGAETCAVCLD 
YECNKQ 


3919 


A 


1 


204 


RVLTAINHTLKEm.RKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEENLKTKKQHRKERLYPLRKYA 
AKA 


3920 


A 


1 


654 


RCCRSFVAPLQEKWFGLFFLGAILCLSFSWLFHT 

VYCHSEGVSRLFSKLDYSG1ALLIMGSFVPWLYY 

SFYCNPQPCFTYLIVICVLGIAAIIVSQVVDMFATPQ 

YRGVRAGVFLGLGLSGIIPTLHYVISEGFLKAATI 

GQIGWXMLMASLYITGAALYAARIPERFFPGKCD 

IWFHSHQLFHIFVVAGAFVHFHGVSNLQEFRFMI

GGGCSEEDAL 


3921 


A 


1587 


452 


LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRPVAAPSRTPAPPHTRARASPGLPSG 

PAWRRVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQV1SDFDMTLSRFAYNGK 

RCPSSYNILDNSKIISEECRKELTALLHHYYPIEID 

PHRTVKEKLPHMVEVVVVTKAHhILLCQQKIQKFQI 

AQVVRESNAMLREGYKTFFNTLYHNNIPLFIFSA 

GIGDILEEORQMKVFHFNIHIVSNYMDFNEDGFL 

QGFKGQLIHTYNKNSSACENCGYFQQLEGKTNV 

IIXGDSIGDLTMADGVPGVQNILKIGFLNDKVEE 

RRERYMDSYDIVLEKDETLDVVNGLLQHILCQG 

VQLEMQGP 


3922 


A 


2 


164 


GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSILHTSYGRERPAPVHLRQDT 


3923 


A 


2 


3258 


EHATHAYAKLGTRRRHREVTVFVPTWQLKKNR 

RVRESHFLTKLHSLKMLSITPSQLENGKJKJTTYD 

YRFMV1CLAEETDGIIVTNEQIHILMNSSKKLMVK 

DRLLPFTFAGNLFMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVVVKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDIDLLPGAASPYLGIPWDGKAPCQQVLAHL

AQLTDPSNFTALSFFMGFMDSHRDAIPDYEALVG 

PLHSIXKQKPDWQWDQEHEEAFLALKRALVSAL 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPWLDLSYASRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVXPPFSDLSTFVCIHMSGYCFYR 

EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPVVFLTHCNWIFSLLWE 

LXPLWRARGFLSSDGAPLPHPSLLSYIISLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 

IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWVVPTQLRRDLIFSVHDIPLGAHQR 

PEETYKKLJILLGWWPGMOEHVKDYCRSCLFCIP 

RNLIGSELKVEESPWPLRSTAPWSNLQIEWGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRJLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 

LKErTFLHGKKWAASLPLLHLAFRASSTDATPFK 

VLTGGESRLTEPLWWEMSSANIEGLKMDVFLLQ 

LVGELLELHWRVADKASEKAENRRFKRESQEKE 

WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 

SLYRIWGFPTPEKXGCIYPSSLMKAFAKSGTPLSF 

KVLEQ 


3924 


A 


1 


1826 


MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV

TQPLKNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKAhTKIDDVroSRVEDPEEGHLKFSSELGMIF ! 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD

TRNAACKEKFYPPDLPAASWICFYNEAFSALLR 

TVHSV1DRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKIKVIRNTKREGLIRGRMIGAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDIISADTLAYSSSPVVRGGFNWGLHFKWDLV

PLSELGRAEGATAPIKSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGHIFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPIFVNR

GPKRPKVLQRGRLYHLQTNKCLVAQGRPSQKG 

GLWLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETRSSDPPRLMKCHGSGGSQQWTFGKNNRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ

WHLEG 


3925 


A 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCIVISSL 

VTTQRKLKAMSLLGSRNQLARAVLNPNPMDFCT 

KDLLTTTSERIIAYLRDFNEDQKKAIBTAYAMVK 

HSPSVAKICLBHGPPGTGKSKTIVGLLYRLLTENQ 

RKGHSDENSNAKJKQNRVLVCAPSNAAVDELM 

KKJILEFKEKCKDKKNPLGNCGDINLVRLGPEKSI 

NSEVLKFSLDSQV>fHRMKKELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSIIDLESHIICCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLIHRCWKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEEhTVmHNMISRLPlLQLTVQ 

YRMHPDICLFPSNYVYNRNLKTNRQTEAIRCSSD 

WPFQPYLVFDVGDGSERRD^SYINVQEIKLVM 

EIIKLIKDKRKDVSFRNIGIITHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFIASLQRlJ^TrnUKYSLFILGHLRTL 

MENQHWNQLIQDAQKRGAIIKTCDKNYRHDAV

KILKLKPVLQRSLTHPPTIAPEGSRPQGGLPSSKJL 

DSGFAKTSVAASLYHTPSDSKEITLTVTSKDPERP 

PVHDQLQDPRLLKRMGIEVKGGIFLWDPQPSSPQ 

HPGATPPTGEPGFPWHQDLSHVQQPAAVVAAJL 

SSHKPPVRGEPPAASPEASTCQSKCDDPEEELCH 

RREARAFSEGEQEKCGSETrDTIRRNSRWDKRTL 

EQEDSSSKKRKLL 


3926 


A 


99 


284 


MPREDRATWKSNYFLKIIQLLDDYPKRFIVGANN 
VGSKQMQQIRMSLRGKAVVLMGKI^TMMR 


3927 


A 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFLIHYYASGENWI 
FGDFMCKFIRFSFHFNLYSSILFLTCFSIFRYCVIIH 
PMSCFSIHKTRCAWACAWWnSLVAVIPMTFLI 
TSTI^TORSACLDLTSSDELNTIKWY>nLILTA\LL 
CLPLVIVTLCYTTIIHTLTHGHANXDSCLKQKARR 
LTDLLL 


3928 


A 


1 


1516 


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR \ 

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTOLNAPNSLGVSALCAICGDR 

ATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFS 

RQCVVDKDKRNQCRYCRLKKCFRAGMKKEAV 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSGINGDIRAKKIASIADVCESMKEQLLVLVE 

WAKYIPGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR 

ILDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGL 

SDPGKIKRLRSQVQVSLEDYINDRQYDSRGRFGE 

LLLLLPTLQSITWQMIEQIQFIJ^FGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVATTVKPLSAIPQPTITKQE

VI 


3929 


A 


1 


2782 


RVLSLESPLEKDPRVLGAQSVPRGRALKGLSPLG 

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEI 

1MGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP 

LGHSNPLDHQBPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPVVPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERFNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH 

AGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSTLIKHQRTHLREDPFKCPVCG 

KTFTLSAT1XRHQRTHTGERPYKCPECGKSFSVS 

SNLINHQRIHRGERPYICADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSNLITHVRTHMDENLFVCSD 

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL 

LG1X5DPSLLTPPPGAKPHKCLVCGKGFNDEGEFM 

QHQREHQGENPYKNADGLIAHAAPKPPQLRSPRL 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVLL 

CUT f> CUT A D DVD OnPD A PPT TXT* t r A r »r*r» T T/\r»T>» t 

EHLRSPZ*GARrYRCoDCRASFLDRVALTRHQET^ 
TQEKPrWEDPPPEAVTLSTDQEGEGETPTPTESS 
SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 
HRSCHPGVSL 


3930 


A 


513 


273 


KTQETHIYISEHIFFPFLQGFGNLPICMAKTDLSLS 

HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 

SRESPLWL 


3931 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEAJLSKFPRQPIREKGPVKEVPGTKGSP 


3932 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933 


A 


1 


1546 


STHASEHWDSALQLAKHLAPDQIPF1SKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA 

QMSIRMGDIRRGVNQALKHPSRVLKRDCGAILE 

NMKQFSEAAQLYEKGLYYDKAASVY1RSKNWA 

KVGDLLPHVSSPKIHLQYAKAKEADGRYKEAW 

AYENAKQWQSVIRIYLDHLNNPEKAVN1VRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE 

AFTLAQQHNKMEIYADITGSEDTTNEDYQSIALY

FEGEKRYLQAGKFFLLCGQYSRALKHFLKCPSSE 

DNVAIEMA1ETVG QAKDELLTNQLIDHLLGEND 

GMPKDAKYLFRLYMALKQYREAAQTAIIIAREE 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLMILHSYILVKJCHTVKNGDHMKGARMLIRVANN 

ISKFPSHIWILTSTVIECHRAGLKNSAFSFAAML 

MRPEYRSKTOAKYKKKIEGMVRRPDISEIEEATTP 

CPFCKFLLPESELL 


3934 


A 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 

GIGGFLVSI^SRMKLQTLAVSVTALKFWSAYVP 

CQTQDRDALRLTLEQIDLIRRMCA SYSELELVTS 

AKALNDTQKLACLIGVEGGHSLDNSLSILRtFYM 

LGVRYLTLTHTCNTPWAESSAKGVHSFYNNISGL 

TDFGEKWAEMNRIX5MMVDLSHVSDAVARRAL 

EVSQAPVIFSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAWGSKFIGIGGDYDGAGKYRKKTTCKAPW 

RTSSRMSS 


3935 


A 


1 


883 


HETTPAVVQSVLIJERGWNKFDKQEQNAEDWNL 
YWRTSSrTlMTEHNSVKPWQQLNHHPGTTKLTR 

KTkPT A VHT V HTV/TD "D K/TVY1TQT VHTiTOT TTTVTl/rDMriV 
IVl^ ^I^/\AX1X# IvrjlVlXVLVJVl I VJl jL X v^Pl-r JLi If V iVLi iNJLJ I 

TKFVAEYFQERQMLGTKHSYWICKPAELSRGRG 

ILIFSDFKDFIFDDMYIVQKYISNPLLIGRYKCDLR 

rYVCVTGFKPLTIYVYQEGLVRFATEKFDLSNLQ 

NKYAHLTNSSINKSGASYEKJKEVIGHGCKWTLS 

RFFSYLR5WDVDDLLLWKKIHRMV1LTILAIAPS 

VPFAANCFELFGFDILIDDNEFHRTG 


3936 


A 


203 


441 


HUVHSLGPIJKHYQYCVRYLYYQVTKDVIKEFA 
DDGVKYLEIJtSTPRRENATGMTKKTYVESILEGI 
KQSKQENLDIDV 
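
The column headings of the preceding table pair each peptide with a predicted beginning and end nucleotide location and a one-letter residue legend. A minimal Python sketch of how those fields relate is given below. It is illustrative only and not part of the disclosed method: the function names are hypothetical, and reading a beginning location greater than the end location as a reverse-strand frame is an assumption, not something the table states.

```python
# One-letter codes as given in the table legend. '/' and '\' mark possible
# nucleotide deletions and insertions rather than residues.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop codon",
    "/": "possible nucleotide deletion", "\\": "possible nucleotide insertion",
}


def nucleotide_span(begin: int, end: int) -> int:
    """Number of nucleotides covered by an inclusive begin/end pair."""
    return abs(end - begin) + 1


def codon_count(begin: int, end: int) -> int:
    """Whole codons that fit in the span (any remainder is ignored)."""
    return nucleotide_span(begin, end) // 3


def strand(begin: int, end: int) -> str:
    """ASSUMPTION: begin > end is read here as a reverse-strand frame."""
    return "-" if begin > end else "+"


def unrecognized_symbols(peptide: str) -> set:
    """Characters outside the legend, e.g. digits left behind by scanning."""
    return {ch for ch in peptide if ch not in LEGEND}


print(codon_count(1, 204))   # SEQ ID NO: 3919 is listed with begin 1, end 204
print(strand(1587, 452))     # SEQ ID NO: 3921 is listed with begin 1587, end 452
print(unrecognized_symbols("MKQ1TSA"))  # hypothetical string with a stray digit
```

A check of this kind can flag residue strings that contain symbols outside the legend before they are used in any downstream comparison.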



471 



WO 01/57190 



PCT/US01/04098 



TABLE 7 





SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score)
1 | [illegible] | 0.930 | [illegible]
2 | 24 | 0.964 | [illegible]
3 | 21 | 0.990 | [illegible]
4 | 19 | 0.981 | 0.942
5 | 22 | 0.991 | 0.928
[illegible] | 21 | 0.956 | 0.843
8 | 22 | 0.913 | 0.718
9 | 17 | 0.997 | 0.969
11 | 19 | 0.930 | 0.680
13 | 36 | 0.983 | 0.863
14 | 28 | 0.935 | 0.839
15 | 21 | 0.997 | 0.955
16 | 16 | 0.983 | 0.944
17 | 18 | 0.989 | 0.884
19 | 49 | 0.996 | 0.719
20 | 28 | 0.972 | 0.920
21 | 23 | 0.954 | 0.905
22 | 46 | 0.955 | 0.568
23 | 26 | 0.942 | 0.654
24 | 19 | 0.979 | 0.941
25 | 34 | 0.884 | 0.565
26 | 33 | 0.934 | 0.584
27 | 17 | 0.975 | 0.914
28 | 18 | 0.980 | 0.934
29 | 23 | 0.928 | 0.718
30 | 26 | 0.978 | 0.885
32 | 20 | 0.946 | 0.719
33 | 29 | 0.933 | 0.671
35 | 25 | 0.996 | 0.920
36 | 26 | 0.903 | 0.579
40 | 19 | 0.981 | 0.942
47 | 25 | 0.971 | 0.909
53 | 22 | 0.991 | 0.928
55 | 24 | 0.960 | 0.808
60 | 19 | 0.986 | 0.967
78 | 22 | 0.913 | [illegible]
[illegible] | 20 | [illegible] | [illegible]
87 | 24 | 0.982 | 0.889
[illegible] | 17 | 0.997 | 0.969
115 | 19 | 0.930 | [illegible]
134 | 36 | 0.983 | [illegible]
136 | 17 | 0.913 | 0.696
137 | 19 | 0.958 | 0.905
[illegible] | [illegible] | [illegible] | [illegible]
143 | 32 | 0.914 | 0.740
153 | 21 | 0.997 | 0.955
154 | 25 | 0.913 | 0.583
155 | 29 | 0.972 | 0.857
169 | 30 | 0.977 | 0.817
170 | 30 | 0.977 | 0.819
171 | 30 | 0.977 | 0.819
175 | 47 | 0.926 | 0.606
176 | 30 | 0.968 | 0.872
177 | 22 | 0.957 | 0.791
192 | 43 | 0.930 | 0.678
195 | 19 | 0.956 | 0.860
202 | 21 | 0.982 | 0.871
203 | 24 | 0.957 | 0.870
207 | 23 | 0.954 | 0.905
224 | 46 | 0.955 | 0.568
225 | 26 | 0.942 | 0.654
228 | 45 | 0.961 | 0.839
231 | 28 | 0.994 | 0.937
232 | 28 | 0.993 | 0.896
234 | 19 | 0.979 | 0.942
235 | 19 | 0.979 | 0.941
238 | 20 | 0.987 | 0.943
244 | 23 | 0.929 | 0.683
250 | 34 | 0.884 | 0.565
256 | 33 | 0.934 | 0.584
258 | 25 | 0.934 | 0.729
259 | 22 | 0.969 | 0.871
264 | 19 | 0.952 | 0.753
265 | 17 | 0.975 | 0.914
266 | 17 | 0.975 | 0.914
271 | 23 | 0.974 | 0.884
274 | 13 | 0.971 | 0.834
275 | 18 | 0.980 | 0.934
278 | 32 | 0.958 | 0.668
280 | 24 | 0.966 | 0.881
281 | 24 | 0.966 | 0.881
286 | 23 | 0.928 | 0.718
291 | 35 | 0.991 | 0.824
293 | 27 | 0.956 | 0.806
294 | 23 | 0.952 | 0.827
301 | 26 | 0.978 | 0.885
316 | 20 | 0.946 | 0.719
320 | 28 | 0.978 | 0.726
327 | 29 | 0.933 | 0.671
331 | 48 | 0.903 | 0.571
345 | 25 | 0.996 | 0.920
349 | 26 | 0.903 | 0.579
351 | 24 | 0.951 | 0.876
352 | 18 | 0.944 | 0.716
353 | 32 | 0.992 | 0.854
354 | 27 | 0.945 | 0.817
355 | 16 | 0.922 | 0.716
356 | 13 | 0.959 | 0.818
357 | 23 | 0.986 | 0.878
358 | 19 | 0.904 | 0.671
359 | 16 | 0.988 | 0.951
360 | 15 | 0.981 | 0.938
361 | 18 | 0.944 | 0.716
362 | 21 | 0.984 | 0.869
363 | 40 | 0.979 | 0.813
364 | 18 | 0.883 | 0.693
365 | 22 | 0.962 | 0.908
366 | 22 | 0.961 | 0.827
367 | 44 | 0.941 | 0.624
368 | 20 | 0.952 | 0.791
369 | 22 | 0.949 | 0.840
370 | 28 | 0.957 | 0.682
372 | 28 | 0.974 | 0.894
373 | 19 | 0.972 | 0.947
374 | 29 | 0.968 | 0.785
375 | 19 | 0.949 | 0.897
377 | 23 | 0.962 | 0.910
378 | 31 | 0.974 | 0.895
379 | 26 | 0.969 | 0.939
380 | 27 | 0.945 | 0.817
383 | 27 | 0.945 | 0.817
384 | 25 | 0.992 | 0.877
385 | 32 | 0.983 | 0.825
386 | 44 | 0.924 | [illegible]
387 | 26 | 0.971 | 0.804
388 | 19 | 0.989 | 0.862
389 | 24 | 0.990 | 0.947
390 | 34 | 0.942 | [illegible]
391 | 16 | 0.992 | 0.716
394 | 19 | 0.987 | 0.970
398 | 36 | 0.992 | 0.866
404 | 13 | 0.959 | 0.818
417 | 23 | 0.986 | 0.878
421 | 19 | 0.904 | 0.671
425 | 28 | 0.971 | 0.717
431 | 16 | 0.988 | 0.951
452 | 18 | 0.944 | 0.716
459 | 21 | 0.991 | 0.902
468 | 21 | 0.984 | 0.869
478 | 40 | 0.979 | 0.813
486 | 18 | 0.883 | 0.692
499 | 22 | [illegible] | [illegible]
501 | 19 | 0.962 | 0.877
514 | 44 | 0.941 | 0.624
529 | 20 | 0.952 | 0.791
533 | 39 | 0.914 | 0.719
548 | 28 | 0.957 | 0.682
561 | 28 | 0.974 | 0.894
562 | 28 | 0.974 | 0.893
564 | 18 | 0.949 | 0.806
576 | 19 | 0.972 | 0.947
584 | 29 | 0.968 | 0.785
585 | 28 | 0.973 | 0.810
591 | 19 | 0.949 | 0.897
592 | 24 | 0.991 | 0.954
594 | 20 | 0.985 | 0.959
595 | 20 | 0.985 | 0.959
612 | 23 | 0.962 | 0.910
619 | 31 | 0.974 | 0.895
621 | 15 | 0.959 | 0.795
633 | 26 | 0.969 | 0.939
640 | 20 | 0.949 | 0.842
645 | 25 | 0.911 | 0.759
684 | 25 | 0.992 | 0.877
691 | 32 | 0.983 | 0.825
698 | 44 | 0.924 | 0.564
700 | 19 | 0.982 | 0.941
710 | 26 | 0.971 | 0.894
714 | 23 | 0.965 | 0.907


718 

f l o 


10 


n Q8Q 

u.707 


0 R62 


72 S 


91 


0 076 


0 RSI 


728 






V.07J 




7^ 


U.7OJ 




/Hi 




ft 047 




744 


1 Q 


ft O^O 

U.7J7 


ft 074 


HAH 


lo 


U.722 


u. / 10 


7^A 
/JO 


20 


ft 

u.y / j 


ft fi£A 


/O/ 


22 


ft OQ< 

u.yoo 


ft OA1 


7£Q 
/Oo 


0*7 

27 


ft 01 a 


ft o<o 
U. /jo 


7£Q 

/oy 


1 0 

iy 


U.yo/ 


ft 07/1 

u.y /u 


T7A 


22 


A OOI 

U.yol 


A Oil 


/ /I 


34 


A QO"3 


A OOl 
U.07J 


/ 13 


20 


U.VOo 


A Q1Q 

u.yjy 


11 A 


0 i 
21 


A ATI 

0.971 


A' QA C 


'inn ^ 
1 to 


OO "~ 

22 


A QBC 


ft O/ll 


tiq 

1 17 


32 


ft Q*7Q 


ft Q.Afi 
U.64D 


/oi 




A OCA 


ft O^T 


*7C^ 
iOJ 


27 


VJjy lo 


ft *7<fi 


/oo 


27 


A OI JC 


A OCD 

U. Oo 


TQO 

/oo 


22 


A hDI 

u.yoi 


A 

U.y33 




22 


A nor 

U.yoo 


A oTvi 


no a 

fyq 


39 1 


A OQO 


U.OD4 


TOT 


0*7 

27 




A B>1*7 


Bin 


22 


A OD 1 

u.yo 1 


A o^n 


02J 


1A 

34 




ft 9Q1 


OX J 


17 


A A£0 


A TOO 

U. / /o 


OJ / 


2U 


ft 0£ff 

u.yoo 


ft old 


QAA 


25 




ft Q<1 


QA C 

o4j 


17 


A A 1 A 

u.y iy 


A *7AiC 1 

U. /Uo 


QA£. 

o4o 


21 


A A*7t 

o.y /i 


A A/l< 


(Ml 


21 


A A*71 

\).y / 1 


U.y4D 




. — — , 

22 


A OQA 

u.yoo 


ft QA1 


SOI 
070" 


24 


u.y / A 






nA 

24 


A C71 

u.y f 1 


ft ftA^ 


896 | 32 | 0.973 | 0.846
899 | 31 | 0.982 | 0.817
922 | 15 | 0.882 | 0.706
924 | 21 | 0.975 | 0.948
925 | 21 | 0.927 | 0.661
933 | 20 | 0.967 | 0.906
960 | 20 | 0.967 | 0.906
967 | [illegible] | 0.970 | 0.784
968 | 47 | 0.970 | 0.557
972 | 36 | 0.945 | 0.775



TABLE 8 



SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to
first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to
last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic
Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide
insertion)


3955 


A 


235 


1272 


GPREVLAASSLADGSEEQVMAVALVRERJDLSFPG
VGDAVVNPTRWHLPAQPEMLYEGGEGRMETLK 




DKTLQELEELQNDSEAJDQLALESPEVQDLQLERE 

MALATNRSLAEKNLEFQGPLEISRSNLSDRYQELR 

KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KJEEESEAMAEKFLEGEVPLETFLENFSSMRMLSH 

LRRVRVEKLQEWRKPRASQELAGDAPPPRSPPP 

V/PPSPPGNTPCG*RAAAATISHASLPFALQPIPQPA 

CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 


3956 


A 


821 


385 


SICADRTERVGIFFYIPAGTTDEADVTHP*EGHSYL 
SNHAGIQRSSRP/SH YQGEAVHDNCK1 ADELQLLT 
YQLCHTYVRCTRSVSIPAPAYYAHLVAFRARYHL 
VDKEHDSAEGSHVSGQSNGRDPQALAKAVQIHQ 
DTLRTMYFA 


3957 


A 


4621 


240 


ELISTFKXLLEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEVDEMMIMIE 

KESVEVAKTEK1VKADETIANEQAMASKAIKDEC 

DADLAGALPILESALAALDTLTAQDITWKSMKSP 

PAGVKXVMEAICILKGIKADK1PDPTGSGKKIEDF 

WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMNIIR 

KNYIPNPDFVPEKIRNASTAAEGLCKWVIAMDSY 

DKVAKIVAPKKIKLAAAEGELKIAMDGLRKKQA 

ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 

SKKLERAEQLIGGLGGEKTRWSHTALBLGQLYIN 

LTGD1L1SSGVVAYLGAFTSTYRQNQTKEWTTLCK 

GRDIPCSDDCSLMGTLGEAVTIRTWNIAGLPSDSF 

SIDNGinMNARRWPLMIDPQSQANKWIKNMEKA 

NSLYV1KLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSTEEYAPDFR 

FYlTTKLRNPHYLPETSVK\nTLLNFMITPEGMQDQ 

LLGIWAQERPDLEEEKQALDLQGAENKRQLKEIE 

DKIIJEVLSSSEGNBLEDETAIKILSSSKALANEISQK 

QEVAEETEKKIDTTRMGYRPIAJHSSELFFSLADLA 

MEPMYQYSLTWFINLFILSESNSEKSEILAKRLQIL 

KDHFTYSLYVNVCRSLFEKDKLLFSFCLTINLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQKS 

WDEICRIJDDLPAFKTTRREFMRLKDGWKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLIIRCLRPDKVIPM 

LQEFIINRLGRAFffiPPPFDLAKAFGDSNCCAPLIFV 

LSPGADP1V1AALLKFADDQG YGGSKLS SLSLGQGQ 

GPIAMKMLEKAVKEGTWVVLQNCHLATSWMPT 

LEKVCEELSPESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKGLRANHRSYLMDPISDPEFFGSC 

KKPEEFKKLLYGLCFFHALVQERRKFGPLWWNIP 

YEFNETDLRISVQQLHMFLNQYEELPYEALRYMT 

GECNYGGRVTDDWDRRTLRSILNKFFNPELVENS

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEVVNEVASDILGKLPNNFDIEAAMRRYPT 

TYTQSMhTTVLVQEMGRFNKLLKTIRDSCVNIQKA 

EKGLAVMSTDLEEWSSILNVKIPEMWMGKSYPS 

LKPLGSYVNDFLARLKFLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVF1HGLFLDGASWNRKIKKLAESH 

PKILYDTVPVMWLKPCKRADEPKRPSYVAPLYKT 

SERRGVLSTTGHSTNFVIAVMTLPSDQPKEHWIGR 
GVALLCQLNS 


3958 


A 


35 


529 


GADMAKSKNHTIHNQSRKWHRNVIKKPLSQRYK 

SLKGVDPKFLG>^CFTKKHKKKGLKKMQADSA 

KAVSTCAKAEEALVKPKEVKPKIPKGVSCELN*LA 

Y1AYPKFWTCACACIAKGLRLCQPKAKAQDQTK 

AQVQKAQAAAPASVPTQAPKGAQAPTKASG 


3959 


A 


1883 


763 


LLVLLLRTNLLIASSTRISRATLTCSPPG1PVDPRVR 

PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 

QLrTDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 

ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 

PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 

PQEDKEVLAVPNSILELPCPHLSALASYYWSHGPAA 

VPEASSTVYNGSLLLIVQDGVGGLYQCWATENGF 

SYPVISYWVDSQDQTLALDPELAGIPREHVKVPLT 

RVSGGAAIAAC^SYWPHFVTVTVLFALVLSGALI 

ILVASPLRALRARGKVQGCETLRPGEKAPLSREQH 

LQSPKECRTSASDVDADNNCLGTEVA 


3960 


A 


1 


481 


SYAAPSLFVKSLYWALAFMAVLLAVSGVVTVVLA 

SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 

SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 

AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 

ALEEGTLVAANCSTPRPWVCAKGTQ 



TABLE 9 



SEQ ID NO: | Accession Number | Species | Description | Smith-Waterman Score | % Identity
3937 | Y27700 | Homo sapiens | Human secreted protein encoded by gene No. 12. | 193 | 25
3938 | AF093097 | Homo sapiens | putative RNA-binding protein Q99 | 3881 | 84
3939 | AB012308 | Anthocidaris crassispina | B2HC | 4169 | 74
3940 | U10248 | Homo sapiens | ribosomal protein L29 | 787 | 95
3941 | Y99418 | Homo sapiens | Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277. | 4031 | 100
3942 | AL023516 | Gallus gallus | B locus C type Lectin | 198 | 35
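
Table 9 reports a Smith-Waterman score for each nearest database match. As an illustration of how such a local-alignment score arises, a minimal Python sketch follows. The scoring values used (match +2, mismatch -1, linear gap -2) are assumptions chosen for readability; the substitution matrix and gap penalties behind the scores in Table 9 are not stated here, so the sketch will not reproduce those numbers.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local-alignment score between a and b (linear gap penalty).

    ASSUMPTION: simple match/mismatch scoring stands in for whatever
    substitution matrix was actually used to produce the tabulated scores.
    """
    best = 0
    prev = [0] * (len(b) + 1)          # previous row of the scoring matrix
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap         # gap in b
            left = curr[j - 1] + gap   # gap in a
            curr[j] = max(0, diag, up, left)   # 0 restarts the local alignment
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical peptides sharing the local segment "ACGT".
print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Because every cell is floored at zero, unrelated flanking residues do not reduce the score of the best-matching segment, which is why the measure suits partial or contig sequences.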



TABLE 10 



SEQ ID NO: | Accession No. | Description | Results*
3937 | PR00049 | WILMS TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 9.168e-11 209-224
3942 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 6.400e-11 37-55

* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence
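
The footnote to Table 10 defines the order of the fields packed into each Results entry. A short Python sketch of splitting such an entry into named fields is shown below. The class and function names are hypothetical, and the example string is the entry for SEQ ID NO: 3942 read as "BL00615A 16.68 6.400e-11 37-55".

```python
from typing import NamedTuple


class SignatureHit(NamedTuple):
    subtype: str      # accession number subtype, e.g. "BL00615A"
    raw_score: float
    p_value: float
    start: int        # first residue of the signature in the peptide
    end: int          # last residue of the signature in the peptide


def parse_result(text: str) -> SignatureHit:
    """Split 'subtype raw_score p_value start-end' into typed fields."""
    subtype, raw_score, p_value, span = text.split()
    start, end = span.split("-")
    return SignatureHit(subtype, float(raw_score), float(p_value),
                        int(start), int(end))


hit = parse_result("BL00615A 16.68 6.400e-11 37-55")
print(hit.subtype, hit.raw_score, hit.p_value, hit.start, hit.end)
```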
TABLE 11



SEQ ID NO: | PFAM Name | Description | P-Value | PFAM Score
3938 | Piwi | Piwi domain | 2.6e-150 | 512.7
3940 | Ribosomal L29e | Ribosomal L29e protein family | 2.3e-19 | 77.8
3941 | Sema | Sema domain | 4e-181 | 615.1
3942 | lectin c | Lectin C-type domain | 0.086 | -7.1



TABLE 12 



SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score)
3941 | 31 | 0.985 | 0.926
3942 | 21 | 0.974 | 0.894
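
Table 12 gives, for each peptide, the position at which the predicted signal sequence ends. The Python sketch below shows one way that position could be used to separate a signal peptide from the remaining mature portion. It assumes the position is 1-based and names the last residue of the signal; the table does not state this convention, and the peptide string used is hypothetical.

```python
# Rows of Table 12 as (SEQ ID NO, end of signal, MaxS, MeanS).
TABLE_12 = [
    (3941, 31, 0.985, 0.926),
    (3942, 21, 0.974, 0.894),
]


def split_signal_peptide(peptide: str, signal_end: int):
    """Return (signal, mature) for a 1-based, inclusive signal end position.

    ASSUMPTION: signal_end is the last residue of the signal sequence.
    """
    if not 0 <= signal_end <= len(peptide):
        raise ValueError("signal end lies outside the peptide")
    return peptide[:signal_end], peptide[signal_end:]


# Hypothetical 11-residue peptide with a 3-residue signal.
signal, mature = split_signal_peptide("MKTLLLAVAVA", 3)
print(signal, mature)

for seq_id, end, max_s, mean_s in TABLE_12:
    print(seq_id, "signal residues 1-%d" % end, "MeanS", mean_s)
```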


TABLE 13 



SEQ ID NO: of full length nucleotide sequence | SEQ ID NO: of full length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority Docket number corresponding SEQ ID NO: in priority application | SEQ ID NO: in USSN 09/496,914
3937 | 3943 | 3949 | 3955 | 787CIP2G 1 | 787 3587
3938 | 3944 | 3950 | 3956 | 787CIP2G 2 | 787 3813
3939 | 3945 | 3951 | 3957 | 787CIP2G 3 | 787 4462
3940 | 3946 | 3952 | 3958 | 787CIP2G 4 | 787 4887
3941 | 3947 | 3953 | 3959 | 787CIP2G 5 | 787 5794
3942 | 3948 | 3954 | 3960 | 787CIP2G 6 | 787 8743



TABLE 14 



TISSUE ORIGIN | LIBRARY/RNA SOURCE | HYSEQ LIBRARY NAME | SEQ ID NOS:
adult brain | GIBCO | ABD003 | 3940
adult brain | Clontech | ABR006 | 3940
adult brain | Invitrogen | ABR014 | 3940
cultured preadipocytes | Strategene | ADP001 | 3937
adult heart | GIBCO | AHR001 | 3940
adult kidney | GIBCO | AKD001 | 3940
adult lung | GIBCO | ALG001 | 3940
young liver | GIBCO | ALV001 | 3940
adult ovary | Invitrogen | AOV001 | 3938, 3940-3941
adult spleen | GIBCO | ASP001 | 3940-3941
testis | GIBCO | ATS001 | 3940
bone marrow | Clontech | BMD001 | 3938, 3940
bone marrow | Clontech | BMD004 | 3940
adult cervix | BioChain | CVX001 | 3940
endothelial cells | Strategene | EDT001 | 3940
fetal brain | Clontech | FBR006 | 3940
fetal brain | Invitrogen | FBT002 | 3940-3941
fetal heart | Invitrogen | FHR001 | 3940
fetal kidney | Clontech | FKD001 | 3940
fetal kidney | Clontech | FKD002 | 3940
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T TDD A t> ~\S 1 

"DMA coiTDrr 


fetal liver-spleen                      Columbia University  FLS001              3937, 3940
fetal liver-spleen                      Columbia University  FLS002              [illegible]
fetal liver-spleen                      Columbia University  FLS003              [illegible]
fetal liver                             Clontech             FLV004              3940
fetal skin                              Invitrogen           FSK001              [illegible]
fetal spleen                            BioChain             FSP001              3940
fetal brain                             [illegible]          HFB001              3937, 3940-3941
infant brain                            Columbia University  [illegible]         [illegible], 3939, 3941
leukocyte                               [illegible]          LUC001              3940-3941
leukocyte                               Clontech             [illegible]         3940-3941
melanoma from cell line ATCC #CRL 1424  Clontech             MEL004              3940
mammary gland                           Invitrogen           MMG001              [illegible], 3940-3941
neuronal cells                          Stratagene           NTU001              3937, 3942


prostate                                Clontech             PRT001              3938
rectum                                  Invitrogen           REC001              3940
salivary gland                          Clontech             SALs03              3941
small intestine                         Clontech             SIN001              3940
skeletal muscle                         Clontech             SKM001              3940
spinal cord                             Clontech             SPC001              3940
thymus                                  Clontech             THMc02              3938
thyroid gland                           Clontech             THR001              3942
uterus                                  Clontech             UTR001              3940

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of 
claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature 
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 



WO 01/57190    PCT/US01/04098 

Pages 485 to 6221 of this application contain sequence listings. 